OVERSIGHT OF THE 2000 CENSUS: REVISITING 
THE 1990 CENSUS 


HEARING 

BEFORE THE 

SUBCOMMITTEE ON THE CENSUS 

OF THE 

COMMITTEE ON 
GOVERNMENT REFORM 
AND OVERSIGHT 
HOUSE OF REPRESENTATIVES 

ONE HUNDRED FIFTH CONGRESS 

SECOND SESSION 


MAY 5, 1998 


Serial No. 105-159 


Printed for the use of the Committee on Government Reform and Oversight 



U.S. GOVERNMENT PRINTING OFFICE 
50-744 WASHINGTON : 1998 


For sale by the U.S. Government Printing Office 
Superintendent of Documents, Congressional Sales Office, Washington, DC 20402 
ISBN 0-16-057602-4 




COMMITTEE ON GOVERNMENT REFORM AND OVERSIGHT 


DAN BURTON, Indiana, Chairman 

HENRY A. WAXMAN, California 
TOM LANTOS, California 
ROBERT E. WISE, Jr., West Virginia 
MAJOR R. OWENS, New York 


BENJAMIN A. GILMAN, New York 
J. DENNIS HASTERT, Illinois 
CONSTANCE A. MORELLA, Maryland 
CHRISTOPHER SHAYS, Connecticut 
CHRISTOPHER COX, California 
ILEANA ROS-LEHTINEN, Florida 
JOHN M. McHUGH, New York 
STEPHEN HORN, California 
JOHN L. MICA, Florida 
THOMAS M. DAVIS, Virginia 
DAVID M. McINTOSH, Indiana
MARK E. SOUDER, Indiana 
JOE SCARBOROUGH, Florida 
JOHN B. SHADEGG, Arizona 
STEVEN C. LaTOURETTE, Ohio 
MARSHALL “MARK” SANFORD, South Carolina

JOHN E. SUNUNU, New Hampshire 
PETE SESSIONS, Texas 
MICHAEL PAPPAS, New Jersey 
VINCE SNOWBARGER, Kansas 
BOB BARR, Georgia 
DAN MILLER, Florida 


EDOLPHUS TOWNS, New York 
PAUL E. KANJORSKI, Pennsylvania 
GARY A. CONDIT, California 
CAROLYN B. MALONEY, New York 
THOMAS M. BARRETT, Wisconsin 
ELEANOR HOLMES NORTON, Washington, DC

CHAKA FATTAH, Pennsylvania 
ELIJAH E. CUMMINGS, Maryland 
DENNIS J. KUCINICH, Ohio 
ROD R. BLAGOJEVICH, Illinois 
DANNY K. DAVIS, Illinois 
JOHN F. TIERNEY, Massachusetts 
JIM TURNER, Texas 
THOMAS H. ALLEN, Maine 
HAROLD E. FORD, Jr., Tennessee 


BERNARD SANDERS, Vermont 
(Independent) 


Kevin Binger, Staff Director 
Daniel R. Moll, Deputy Staff Director 
David A. Kass, Deputy Counsel and Parliamentarian 
Judith McCoy, Chief Clerk 
Phil Schiliro, Minority Staff Director 


Subcommittee on the Census 
DAN MILLER, Florida, Chairman 

THOMAS M. DAVIS, Virginia CAROLYN B. MALONEY, New York 

JOHN B. SHADEGG, Arizona ROD R. BLAGOJEVICH, Illinois 

VINCE SNOWBARGER, Kansas DANNY K. DAVIS, Illinois 

J. DENNIS HASTERT, Illinois 

Ex Officio 

DAN BURTON, Indiana HENRY A. WAXMAN, California 

Thomas B. Hofeller, Staff Director 
Lara Chamberlain, Professional Staff Member 
Kelly Duquin, Professional Staff Member 
David McMillen, Minority Professional Staff Member 





CONTENTS 


Page 


Hearing held on May 5, 1998 .... 1
Statement of:
    Henderson, Wade, executive director, Leadership Conference on Civil Rights .... 245
    Sawyer, Hon. Thomas C., a Representative in Congress from the State of Ohio; and Hon. Thomas E. Petri, a Representative in Congress from the State of Wisconsin .... 7
    Stark, Philip, professor of statistics, University of California, Berkeley; Kenneth Darga, Ph.D., demographer, Department of Management and Budget, State of Michigan; and Jerry Coffey, Ph.D., mathematical statistician .... 50
Letters, statements, etc., submitted for the record by:
    Coffey, Jerry, Ph.D., mathematical statistician, report of the Committee on Adjustment of Postcensal Estimates .... 104
    Darga, Kenneth, Ph.D., demographer, Department of Management and Budget, State of Michigan:
        Information concerning the census undercount adjustment .... 59
        Prepared statement of .... 97
    Henderson, Wade, executive director, Leadership Conference on Civil Rights, prepared statement of .... 251
    Maloney, Hon. Carolyn B., a Representative in Congress from the State of New York, followup questions and responses .... 32
    Miller, Hon. Dan, a Representative in Congress from the State of Florida:
        Eight guidelines for adjustment .... 4
        Followup questions and responses .... 175
    Petri, Hon. Thomas E., a Representative in Congress from the State of Wisconsin, prepared statement of .... 20
    Sawyer, Hon. Thomas C., a Representative in Congress from the State of Ohio, prepared statement of .... 10
    Stark, Philip, professor of statistics, University of California, Berkeley, prepared statement of .... 53






OVERSIGHT OF THE 2000 CENSUS: 
REVISITING THE 1990 CENSUS 


TUESDAY, MAY 5, 1998 

House of Representatives, 
Subcommittee on the Census, 
Committee on Government Reform and Oversight, 

Washington, DC. 

The subcommittee met, pursuant to notice, at 3 p.m., in room 
2247, Rayburn House Office Building, Hon. Dan Miller (chairman 
of the subcommittee) presiding. 

Present: Representatives Miller, Davis of Virginia, Shadegg, 
Snowbarger, and Maloney. 

Staff present: Thomas Hofeller, staff director; Thomas Brierton, 
deputy staff director; Jennifer Safavian, chief counsel; Lara Cham- 
berlain and Kelly Duquin, professional staff members; David 
Flaherty, senior data analyst; Michelle Ash, minority counsel; and 
David McMillen, minority professional staff member. 

Mr. MILLER. Good afternoon. We’ll get this hearing underway. 
First, I ask unanimous consent that all Members’ and witnesses’ 
written statements be included in the record. Without objection, so 
ordered. 

This afternoon, we’ll have opening statements by Congress- 
woman Maloney and myself, and then we’ll proceed immediately to 
the first panel. 

This is our second hearing of the Census Subcommittee. The concern
we have is that we are moving toward a failed census. The
General Accounting Office has given us warnings consistently each 
time they’ve given a report, the most recent one being in March, 
that the risk of a failed census has increased. The Inspector Gen- 
eral has given us a warning that the plan that has been proposed 
for the year 2000 census — I call the largest statistical experiment 
in history — is a very risky endeavor. 

The census is something that is extremely critical and, as we get
closer to the census, I think it will become even more evident to
Americans because it is fundamental to our elected democratic
form of government. Most elected officials in this country are
dependent upon a census: school board members, county commissions,
city councils, State legislatures, and Congress, of course. If we
have a census that fails, we are threatening our democratically
elected system of government. But we also have to have a census
that the American people trust. If we have a census that is not
trusted, we are threatening the way we operate in this country.
The skepticism in this country would greatly increase.



Today, the focus is going to be on the 1990 census and looking 
at what worked, what didn’t work, and what we should learn from 
that experience. The 1990 census consisted, as we know, first of an
enumeration where we tried to count the entire population of this 
country. It counted 98.4 percent of the people, the second best cen- 
sus in history — not a bad number, actually. Some people may even 
think it’s the best census we’ve had. 

After they did the enumeration, a sample was conducted of ap- 
proximately 150,000 households that was going to be used for ad- 
justment. What we know happened in 1990 was that sampling was 
a failure. Secretary Mosbacher considered the option of using sam- 
pling for adjustment, and he rejected it. The recommendation from 
the Census Bureau was based on adjustment; they wanted to take 
a congressional seat away from Wisconsin and a seat away from 
Pennsylvania. After Mosbacher rejected that recommendation, in 1992
they realized there was a computer mistake, and it never should 
have been a recommendation. It would have been done after the 
fact if Secretary Mosbacher had made the decision to eliminate a 
seat from both Pennsylvania and Wisconsin. 

The Census Bureau has acknowledged that the information from 
the 1990 census was less accurate for population areas of under 
100,000 people, so anything with less than 100,000 people was
statistically less accurate. Now that means all census tracts,
municipalities, counties, and all of less than 100,000 people, had
less accurate information than if you’d adjust it. The census tracts,
the census blocks, are the cornerstones of how you build up congressional
districts, city council districts, and school board districts. The idea
of trying to use something less accurate as the foundation was, to
me, a little unbelievable that they’d even attempted to do it.

The Census Bureau felt the sampling that took place after the 
1990 census was so inaccurate that it would not be used in any 
intercensal analysis (that is, when you adjust the census between
1990 and the year 2000), and they did not use the sampling that
was done back in 1990. 

One of the concerns that many people had is that the Census
Bureau was actually deleting people from counts. They would go
through a census block or a census tract and delete people: people
that were not necessarily double counted or should not have been
counted. They just would delete them to say, on average, they
shouldn’t exist.

Well, what’s been proposed for the year 2000 census is, first of
all, they’re not even going to do a full enumeration to start with. 
They’re only going to count 90 percent of the population. We have 
no fallback position. This means they’re going to totally rely on 
sampling in year 2000. They are not going to attempt to do a full 
enumeration because they decided adjustment and sampling is the 
only way to go, and yet sampling was the failure in 1990. The plan 
now is to count 90 percent, and then they’ll do a sample after that 
of 750,000 households. That’s about five times larger than 1990, 
and they’re going to allow half the time to do it. Now, we’re going 
to count twice as many households in half the time and in year 
2000, they’re going to use a less experienced work force. Instead of 
using census employees, they’re going to use part-time workers. It’s 
an unrealistic goal to achieve, and that is part of the reason we’re
moving toward failure. I am concerned that the administration is
pushing more political science than it is statistical science or em- 
pirical science. 

We had some problems in 1990, and that’s what we’re here to 
learn about. We need to come up with how to go about addressing 
those problems of undercount and do a better job. The Census Bu- 
reau, I think, is moving in the right direction by correcting some 
of those problems. For example, we know that 50 percent of the 
error in 1990 is related to the address list, and the Census Bureau 
has recently asked for a supplemental appropriation of $100 mil- 
lion to help address that issue. They are using, in this case, better 
marketing techniques and I think that’s very helpful. 

In 1991, when Secretary Mosbacher was addressing the issue of 
whether to use adjustment or not, he had eight guidelines. I think 
there are copies of those available, and I think it’s worthy of look- 
ing at those guidelines in evaluating whether adjustment should be 
used. 

[The guidelines referred to follow:] 



Mosbacher Eight Guidelines For Adjustment 


1. The Census shall be considered the most accurate count of the population of the 
United States, at the national, State and local level, unless an adjusted count is 
shown to be more accurate. The criteria for accuracy shall follow accepted 
statistical practice and shall require the highest level of professional judgment from 
the Bureau of the Census. No statistical or inferential procedure may be used as a 
substitute for the Census. Such procedures may only be used as supplements to 
the Census. 

2. The 1990 Census may be adjusted if the adjusted counts are consistent and
complete across all jurisdictional levels: national, State, local, and census block.
The resulting counts must be of sufficient quality and level of detail to be usable
for Congressional reapportionment and legislative redistricting, and for all other
purposes and at all levels for which census counts are published.

3. The 1990 Census may be adjusted if the estimates generated from the pre-specified
procedures that will lead to an adjustment decision are shown to be more accurate
than the census enumeration. In particular, these estimates must be shown to be
robust to variations in reasonable alternatives to the production procedures,
and to variations in the statistical models used to generate the adjusted figures.

4. The decision whether or not to adjust the 1990 Census should take into account the 
effects such a decision might have on future census efforts. 

5. Any adjustment of the 1990 Census may not violate the United States Constitution
or Federal statutes. If an adjustment would violate Article I, Section 2, Clause 3
of the U.S. Constitution, as amended by Amendment 14, section 2, or 13 U.S.C. 
section 195, or any other constitutional provision, statute or later enacted legislation, 
it cannot be carried out. 

6. There will be a determination whether to adjust the 1990 Census when sufficient 
data are available, and when analysis of the data is complete enough to make 
such a determination. If sufficient data and analysis of the data are not available 
in time to publish adjusted counts by July 15, 1991, a determination will be made 
not to adjust the 1990 Census. 

7. The decision whether or not to adjust the 1990 Census shall take into account the 
potential disruption of the process of the orderly transfer of political representation 
likely to be caused by either course of action. 

8. The ability to articulate clearly the basis and implications of the decision whether 
or not to adjust shall be a factor in the decision. The general rationale for the 
decision will be clearly stated. The technical documentation lying behind the 
decision shall be in keeping with professional standards of the statistical 
community. 






Mr. Miller. Let me just comment about a couple of these. This 
first one: the census shall be considered the most accurate count of
the population of the United States at the National, State, and 
local level unless an adjusted count is shown to be more accurate — 
unless. The burden of proof is on the change. If they’re going to 
change to those radical new ideas, the burden of proof is on them 
to prove that it worked, and in 1990, it was a failure. I think it’s 
irresponsible, especially considering that we’re dropping the idea of 
counting everybody in a full enumeration. We have to go to an
adjusted count without having a fallback position.

Then, the second point is that the 1990 census was adjusted. The 
adjusted counts are not consistent and complete across all jurisdic- 
tional levels: National, State, local, and census block. Well, the 
Census Bureau, itself, acknowledges that counts under 100,000 are 
less accurate. 

Another point was that the decision on whether or not to adjust 
the 1990 census should’ve taken into account the effects of such a
decision on future census efforts. The concern that I have there is 
the mail response rate. That’s one of the keys that we need to have 
a successful census. And we know right now in Sacramento and Co- 
lumbia, SC, where the dress rehearsals are taking place, the mail 
response rate is below 50 percent. The response to that usually is, 
“Well, that’s a dress rehearsal, and people know it doesn’t really 
count, and that’s the reason the response is less.” If people under- 
stand that all we’re going to do is sample, why complete a question- 
naire? We’re going to lower the response rate by mail, once people 
know we’re going to adjust the census. So we’re really threatening 
future census efforts if we start using sampling right off the bat. 

And one final comment, the ability to articulate clearly the basis 
and implications of the decision whether or not to adjust shall be 
a factor in the decision. The general rationale for the decision will 
be clearly stated. The idea is: how do you explain to a community
that has people deleted from the counts? That the Census Bureau
goes in there and honestly counts the population? The Census Bu- 
reau here in Washington says, “We’re going to reduce your popu- 
lation, not because of duplication in people being counted, but just 
because, statistically, there’s an average, and we think you should 
be deleted.” That happened when they tried to do adjustment in 
1990. That’s going to be very difficult to explain. What we do know 
about 1990, and will hear more about and discuss today, is that 
sampling was a failure. Trying to use sampling and totally rely on 
sampling without a fallback position in year 2000 is, in my opinion, 
irresponsible. Sampling is not ready for prime time. We need
to do a full enumeration and continue to work on this effort. 

And with those statements, and before we begin, I would like to 
call upon the ranking member, Mrs. Maloney. 

Mrs. MALONEY. I’d like to thank very much the chairman for 
yielding, and I’d also like to very much welcome two of my col- 
leagues, Congressmen Sawyer and Petri. I look very much forward 
to your testimony. 

Much of what we know about the 1990 census is a direct result 
of the work done by Congressman Sawyer’s subcommittee. Indeed, 
his subcommittee also laid much of the groundwork for the 2000 
census. He was among the few Congress Members who understood
that oversight of the census is a decade-long responsibility, not
something that can be done in the last 2 or 3 years before the cen- 
sus. 

I would also like to welcome Wade Henderson, the executive di- 
rector of the Leadership Conference on Civil Rights. I am sorry 
that he is the last witness that we will have today, and I do hope 
that we will get to his testimony before the 5 o’clock scheduled 
votes. 

For some, the 1990 census was a success. If you are white and 
living in the suburbs, the census did a good job of counting you and 
your neighbors. For many, however, the 1990 census was a failure. 
For urban and rural blacks, the census was a failure. For whites 
living in rural rental housing, the census was a failure. For poor 
Hispanics in urban, suburban, or rural areas, the census was a fail- 
ure. The census was a failure for these people because a large per- 
centage of them were left out. 

Today, we will hear testimony from three scholars about why the 
attempts to fix the 1990 census did not work. I hope they will also 
address how we make sure the same mistakes are not made again 
in the 2000 census. The 1990 census failed both the public and 
Congress, and we simply cannot let that happen again. 

I know there has been a great deal of partisan discussion and
debate regarding the 2000 census, but now I would like to really, in
a bipartisan way, reach out and really compliment the question
posed by my Republican colleague, Representative Harold Rogers,
when he testified before Sawyer and Petri at a hearing.
Representative Rogers asked, in reference to the 1990 census, and I quote,
“Were the methods for counting our population, while learning more
about it, outmoded? In light of existing sampling techniques, they
were,” end quote. I agree with Representative Rogers. I agree with
Representative Porter Goss, who took to the House floor on
September 25, 1992, and he said (this Republican elected official with
whom I agree), and I quote from Porter Goss, quote, “If the data
are adjusted, four million people not included in the official 1990
census will be acknowledged, and the statistics will be truly
reflective of the actual population of the United States,” end quote.

The fact that the attempts to fix the 1990 census can be con- 
strued to have failed is all the more reason we must work harder 
to see that there is a system in place to correct these inequities in 
2000. Some seem to be saying that since the plan to adjust the cen- 
sus in 1990 was not perfect, we should simply do nothing in 2000. 
I am glad these people weren’t in charge of our space program. 
After Apollo 13, they would have folded their tents and run for the 
hills. 

I urge all of our witnesses to be mindful of the consequences if 
the 2000 census is a failure. 

For Congress, it will be an embarrassment, although I am sure 
that there are many here who would prefer that we did not redis- 
trict the Congress in 2001. For the public, an inaccurate census is 
a travesty. Representation will be misallocated, and Federal funds 
will be distributed in excess to the wealthy and with scarcity to the 
poor. It is our responsibility to get the most accurate census. 





The National Academy of Sciences has come out in favor of sam- 
pling, as has the Census Bureau, as being more accurate and cost- 
ing less. 

I am very pleased that two of my colleagues who have worked 
very hard on this issue, both in this Congress and in prior Con- 
gresses, are here. I look forward to Representative Sawyer’s and 
Representative Petri’s testimony. 

Porter Goss 

Mr. Miller. Let the record show it was Congressman Porter 
Goss 

Mrs. Maloney. Porter Goss. 

Mr. Miller [continuing]. And not Peter Goss, yes. Thank you. 

We’ll have our colleagues Congressman Sawyer and Congress- 
man Petri, if you’d come forward, and we appreciate your being 
here today as we — it was actually the suggestion of Congress- 
woman Maloney — that we have you here because of your experi- 
ence and knowledge from 1990, and I’m glad we have the time 
which wasn’t available, at the first hearing when we wanted to 
focus on the dress rehearsals. 

Congressman Sawyer, both of you, your official statements will 
be put in the record, if you’d like to begin. 

STATEMENTS OF HON. THOMAS C. SAWYER, A REPRESENTA- 
TIVE IN CONGRESS FROM THE STATE OF OHIO; AND HON. 

THOMAS E. PETRI, A REPRESENTATIVE IN CONGRESS FROM 

THE STATE OF WISCONSIN 

Mr. Sawyer. Well, thank you very much, Mr. Chairman. Thank 
you for this hearing. Thank you, Congresswoman Maloney, for your 
part in helping make this possible. I’m going to try to truncate my 
testimony because it’s simply too long to read. But, let me begin 
by saying I’m not going to engage in a jeremiad about sampling, 
although, if you have questions, I’d be pleased to discuss them. I 
think that much of what you have said, Mr. Chairman, is true. I 
think some is a misreading, but that, nonetheless, is a matter of 
difference of opinion. 

What I’d like to do this afternoon is to go through the kinds of 
difficulties that were encountered in 1990 because, I think, they’re 
instructive for the period that we’re in right now. I think it’s impor- 
tant to understand that problems can be detected during the dress 
rehearsal, but often those problems underestimate what will actu- 
ally happen during the actual count. A dress rehearsal is much like 
the war games that every military force on Earth undertakes, but 
the chaos of war is a very different matter. 

Trying to count the Nation, in a matter of weeks, requires an 
enormous amount of flexibility and capacity to adjust to change as 
it occurs. In that sense, the 1990 census encountered operational 
problems almost from the very start. In March, when they mailed 
out some 90 million forms across the United States, newspapers 
began to — and local officials — began to report that they were not 
fully delivered, in fact, although it was only about 4 million that 
were undelivered. And that undeliverable rate is relatively small 
for such a large mailing. Public confidence was shaken considerably 
and began to play itself out in other ways. 





A more fundamental problem with the census became apparent 
quickly. Instead of having the 70 percent mail-back rate or the
75 percent hoped-for rate, the census encountered a mail-back rate 
of what, I believe, was under 65 percent. In some neighborhoods, 
that response rate hovered around 30 to 40 percent, not unlike 
some of the kinds of things that are being encountered in the dress 
rehearsal today. 

That caused particular problems because it left the Bureau with 
a 30 percent greater workload for the door-to-door followup work 
than it had planned for, in terms of time, money, and work force. 
The fieldwork took more than twice as long, some 14 weeks instead 
of the 6 weeks that had been planned. It took 6 weeks alone just 
to gather the information on the final 10 percent of non-responding 
households. 

Much of the information, therefore, was of dubious quality. The 
further removed from the time of the actual census date that the 
information is collected, the more it deteriorates. In some cases, the 
efforts of census takers to gather information directly from households
were futile. This led to a collection of data from surrogates,
including letter carriers, neighbors, building managers, or people
that were encountered on the streets. The GAO noted that 3.2
percent of the Nation’s occupied housing units, about 7 million peo- 
ple, were included in the census based on information collected in- 
directly. In some urban areas, these last-resort procedures were 
used at more than twice the national rate. Clearly, the Census Bu- 
reau struggled to count the last 10 percent of the households lead- 
ing to a disproportionate amount of non-sampling error in the 
hardest to count communities. 

Not surprisingly, as a result of all this, the Bureau ran out of 
money long before the census was finished. It cost at least $10
million to visit every 1 percent of households that didn’t
respond by mail. And, it cost more than twice that much as census
takers made visit after visit to the hardest to count, final 10 per- 
cent. 

Second, the Bureau had to hire more enumerators and keep local 
census offices open longer for an emergency appropriation of about 
$110 million in order to get the job done.

In a second large area, maintaining an adequate workforce of 
qualified enumerators, even with the more difficult economy that 
the Nation had in 1990, quickly became a problem as well. Because 
of the unexpectedly large workload in the door-to-door phase, they 
had to recruit and train many more temporary workers to meet the 
hiring needs. The Bureau was forced to increase its pay rates at 
the same time. These problems compounded one another and cre- 
ated what was widely regarded as a failure, as you noted earlier. 

It was the first census in modern times that yielded less accurate
results than the previous decade. Its costs escalated significantly, 
despite the best efforts to eliminate the persistent and dispropor- 
tionate undercount of urban and poor minorities. The census, 
again, had failed to reduce the number of those who it missed. The 
undercount was also significantly higher than in 1980. In fact, the 
number of minorities missed in 1990 was greater than the total of 
all people missed in 1980. That difference — that inequality — was 
still quite unacceptable. 





I think everybody in the room would agree that we can’t let this 
happen again. I’ve taken the view, both at the time that it was 
going on and after considerable analysis afterward, that the 1990 
census was not so much a failure of execution as it was a failure 
of design, a 30-year-old design whose roots were grounded in the 
1960’s, that had simply outgrown our Nation. Today, the rate of 
change in this country is more profound and deeper and more dif- 
ficult to deal with in a larger nation than anything that was antici- 
pated in 1960. 

Today, we are a Nation on the move. Poor people, in general, 
move around a lot. Growing numbers of them are homeless or not 
tied to a permanent address. Migrant farm workers and construc- 
tion workers have created problems that were difficult to antici- 
pate. Even upper middle-class people are highly mobile today, and 
wealthy people are multi-residential. It comes down to this:
traditional counting methods based on house-grounded census
techniques can no longer, by themselves, fully accommodate a changing,
transient population.

Some people believe that advertising and promotion and outreach 
will solve the problem, and it is important. It must be done. But I’m
not convinced that it will significantly reduce — much less elimi- 
nate — the undercount. 

Even after the emergency appropriation of $100 million, the 
count still yielded a disgraceful, disproportionate undercount. 

The Census Bureau’s sampling plan, as you suggest, is not per- 
fect. Make no mistake about it, however, population numbers pro- 
duced by traditional counting methods are rife with error. They 
may look precise, but they are too often precisely wrong. Accuracy 
is the real question that we need to pursue. 

I believe that it’s a mistake to force the Census Bureau, ahead 
of time, to continue to use counting methods that have proven, dec- 
ade after decade, to yield poor and deteriorating results at high 
costs, when we have the potential to have sound science produce 
a better result. 

Mr. Chairman, just in conclusion, let me say that it’s reasonable 
to have concerns about whether or not the Bureau is sufficiently 
prepared for 2000. But at this point, in the decennial cycle, there 
are bound to be uncertainties, bound to be procedures that still 
need to be refined and decisions yet to be made. That’s simply the 
nature of such a complex undertaking. 

It’s my hope that the Census Bureau and the subcommittee will 
welcome one another’s help, will work together as partners to en- 
sure the most accurate possible count for our Nation. Without a 
constructive partnership with the Congress, the census is, indeed, 
doomed to a repeat performance of 1990. 

Thank you very much for the chance to be here today, Mr. Chair- 
man. 

[The prepared statement of Hon. Tom Sawyer follows:] 





Statement of The Honorable Tom Sawyer 
"Oversight of the 2000 Census: Revisiting the 1990 Census" 

Committee on Government Reform and Oversight 
Subcommittee on the Census 

May 5, 1998 
3:00 p.m. 

Thank you Mr. Chairman, Congresswoman Maloney, and members of 
the subcommittee for the opportunity to share my experiences from 
the 1990 census as the former chairman of the Subcommittee on 
Census, Statistics and Postal Personnel. I am pleased to be here 
and pleased that our former ranking member, Congressman Petri, is 
able to join me. 

The purpose of my testimony this afternoon is to share with 
you some of the problems that the Census Bureau encountered during 
the conduct of the 1990 census. We can expect that many of the 
same difficulties will reoccur during the 2000 census. Potential 
problems can be detected during the Dress Rehearsal but often grow 
in magnitude during the actual count. We must be careful, however, 
not to mistake inevitable uncertainties for problems we expect the 
Bureau to anticipate. 

The 1990 census encountered operational problems almost from 
the start. In mid-March, the Census Bureau mailed approximately 90 
million questionnaires to the households on its address list. 
Within days, local post offices began to report that millions of 
those forms could not be delivered as addressed. The primary 
glitch was caused mostly by rural households which receive their 



11 


mail at a post office box, not a street address normally used in 
urban areas. The Postal Service was unable to deliver four million 
forms that included rural route or street addresses not recognized 
as delivery points for mail. While the "undeliverable" rate was 
relatively low for such a large mailing, public confidence in the 
census was shaken considerably as the problem of missing census 
forms hit the front page of newspapers across the country. 

A more fundamental problem with the census became apparent 
within weeks of the start date. Simply put, fewer households than 
the Census Bureau had anticipated were mailing back their 
questionnaires. Instead of the estimated 70 percent mail response 
rate, only 65 percent of American households bothered to return 
their forms. In some neighborhoods, response rates hovered at 
around 30 to 40 percent, causing despair among city and community 
leaders, and census officials alike. 

This disappointing response left the Bureau with a 30 percent 
greater workload for the door-to-door follow-up work than it had 
planned for in terms of time, money, and workforce. In fact, the 
field work took more than twice as long as the Bureau had planned: 
fourteen weeks instead of six. It took six weeks alone just to 
gather information on the final ten percent of non-responding 
households.

Consequently, much of the information collected as spring 
turned into summer and summer turned into fall was undoubtedly of 
dubious quality. By virtue of the passage of time since Census 
Day, many households — particularly more mobile, lower income 




populations — were likely to provide inaccurate information about 
who lived there on April 1. In some cases, the efforts of census 
takers to gather information directly from a non-responding 
household were futile. This led to the collection of data from 
surrogates such as letter carriers, neighbors, or building 
managers. The General Accounting Office noted that 3.2 percent of 
the nation's occupied housing units — or about 7 million people — 
were included in the 1990 census based on information collected 
indirectly. In some urban communities, however, these "last 
resort" procedures were used at more than twice the national rate, 
and in 14 local census areas, more than 10 percent of occupied 
housing units were counted in this way. Clearly, the Census Bureau 
struggled to count the last ten percent of households, leading to 
a disproportionate share of mistakes (called "non-sampling error") 
in the hardest-to-count communities. 

Not surprisingly, the Bureau ran out of money long before the 
census was finished. It had cost at least $10 million to visit 
every one percent of households that didn't respond by mail. That 
figure more than doubled for the last ten percent of non-responding 
households, as census takers made visit after visit to gather 
information against the clock. The Bureau had to hire more 
enumerators than it had planned and had to keep local census 
offices open longer than expected. It turned to Congress for an 
emergency appropriation of $100 million to get the job done. 

Maintaining an adequate workforce of qualified enumerators 
quickly became a problem, as well. Because of the unexpectedly 




heavy workload during the door-to-door phase, the Bureau had to 
recruit and train many more temporary workers, a difficult 
prospect, at best. In order to meet its hiring needs in many 
areas, the Bureau was forced to increase its pay rates, adding to 
the escalating cost of the census. 

These problems compounded one another and what we had in the 
end was a census that was widely regarded as a failure. It was 
the first census in modern times that yielded less accurate results 
than the previous decade, even as costs escalated significantly. 
Even more troubling is the fact that despite the Census Bureau's 
best efforts to eliminate the persistent and disproportionate 
undercount of the rural and urban poor and minorities, the '90 
census again failed to reduce the number of those who were missed. 
In fact, the undercount was significantly higher than in 1980. 
More minorities were not counted in 1990 than the total of all 
people missed in 1980. That difference — that inequality — was, 
and still is, unacceptable. 

I think everyone in this room agrees that we cannot let that 
happen again in 2000. Not when we have the scientific knowledge to 
significantly reduce (if not eliminate) the undercount. 

I firmly believe that the 1990 census was not a failure of 
execution, but a failure of design -- a 20 year-old design that has 
outgrown our nation. The Census Bureau did the best job it could 
with the tools it had. Unfortunately, as we later learned, those 
tools could not accommodate a changing population. 

The fact is, we are a nation on the move. But even that 




mobility and its character is changing. Consider impoverished 
populations that are migratory and homeless: poor people move 
around a lot. Growing numbers of people are homeless or are not 
tied to a permanent address. Migrant farm workers and the growing 
numbers of moving construction workers around the country have 
created problems that are difficult to anticipate. Even upper 
middle-class people are highly mobile and wealthy people are multi- 
residential. It comes down to this: traditional counting methods 
are based on house-grounded census techniques that can no longer
fully accommodate a changing, transient population. 

Some people believe that increased advertising and promotion 
and outreach will solve the problem of the undercount. Indeed, 
paid advertising and increased promotion and outreach may help keep 
the mail response rate at an acceptable level but they cannot, on
their own, significantly reduce, much less eliminate, the
undercount.

Even after an emergency appropriation of $100 million for the 
1990 census, the count still yielded a disgraceful disproportionate 
undercount of minorities and the rural and urban poor. 

From my experience of evaluating the 1990 census, I have come 
to believe that no amount of money that Congress throws at the 
census will count those who are difficult to reach or those who are 
fearful or mistrustful of the government. 

The Census Bureau's sampling plan is not perfect. But make no 
mistake about it: the population numbers produced by traditional 
counting methods are rife with error. They may look precise, but 




they are wrong. It is absolutely irresponsible for Congress to
force the Census Bureau to continue to use counting methods that 
have proven decade after decade to yield poor results at high 
costs, when sound science will allow us to do better. 


In closing, Mr. Chairman, it is certainly reasonable to have 
concerns about whether or not the Census Bureau is prepared for the 
2000 Census. However, at this point in the decennial cycle, there 
are bound to be uncertainties, bound to be procedures that still 
need to be refined, bound to be decisions yet to be made. That is 
simply the nature of such a complex undertaking. 

It is my hope that the subcommittee will welcome the 
opportunity to work as a partner with the Census Bureau to ensure 
the most accurate count possible for our nation. Without a 
constructive partnership with Congress, the census is doomed to a 
repeat performance of 1990. 




Mr. Miller. Congressman Petri. 

Mr. Petri. Mr. Chairman, members of the subcommittee, thank 
you. It seems like old times. Tom and I somehow ended up in this 
business and had many hearings because he did take, very seri- 
ously, his responsibility as a Member of this House and, at that 
time, as chairman of the subcommittee with oversight responsibility
over the Bureau of the Census, to conduct extensive hearings
into different aspects of the census, to encourage the Census Bureau
to refine and improve its procedures for the 2000 census
and, also, to give a variety of different groups and individuals who
have concerns about one aspect or another of the census, opportunities
to air those concerns. And I think there has been fruit already
from that effort. It’s been a productive effort.

The census is — believe it or not — a very, very important exercise 
for our country in all kinds of ways. It’s written into our Constitu- 
tion which is unusual — not only do we want to have an accurate 
and updated count for fair political representation purposes and for 
a fair distribution of various formula population-driven funds 
across the country, but the census, and the various long form and 
other parts of the census, provide a wealth of data to industry to 
help our whole economy operate more efficiently than it could with- 
out that information. 

Other countries are struggling to put in place their own versions 
of what we have here, going back to our Constitution. That’s why 
it makes me very sad that we may be careening toward attempting 
to make a massive change in the methodology of the census that 
could — be constitutionally suspect. The Constitution requires an ac- 
tual enumeration, and we’re not quite sure what that means, but 
it could be constitutionally suspect on a partisan basis, and that’s 
bad. I think we should attempt to avoid, to the extent we can, 
doing departures — we’ve done it for 200 years making changes that 
are not based on, at least, fair consensus of support or tolerance 
across the political spectrum and among the parties. 

I think it’s bad to criticize the census, and unfortunately, that’s 
been happening by this change. I hope at a minimum, that as we 
go forward with the census, if somehow the agreement cannot be 
resolved and the Census Bureau attempts to adjust it, as they did 
after the last census, that provisions be made to conduct a com- 
plete census and then adjust it. Should the census adjustment not 
be allowed when challenged in court, we still would have a census 
that we could rely on. The country could move forward in an accu- 
rate way rather than, basically, foreclosing a realistic constitutional 
test giving the court the option of throwing the country into chaos, 
in some respects, or going along with the adjustment, even if they 
don’t feel the Constitution actually allows that for the basic census. 
An actual enumeration, I think, a lot of people feel meant a head 
count. And there was a reason for doing that, and that was, that 
the Founding Fathers and a lot of other national experiences have 
been that this, when it’s politicized, the numbers get manipulated, 
however the veneer or whatever the veneer, and I think that’s a 
legitimate suspicion. 

I lived for a couple of years in the country Somalia where the dif- 
ferent tribal weights were obviously very, very important. And they 
were so important, they would not allow a census. They all just 





sort of argued how big they were, and it was sort of bargained out 
politically. 

So the idea of resort to fact, at the end of the day, is important. 
Just as in an election, we don’t adjust or have a poll. It may be 
an unfair election. Different elements of the community may not 
have turned out as much as proportionately it would be indicated. 
But when the ballots are cast and they’re counted, that’s what de- 
termines who won the election. And we don’t adjust it; however, 
some may feel it’s unfair. What we do, is keep trying to make ef- 
forts to have broader involvement, outreach, get people to vote. 
And, I think the idea of reaching out and using this to get people 
to participate, as an act of citizenship, in the census is very important,
and there are a lot of things we can agree on in that regard.

I think a major public information campaign leading up to the 
census that could, in part, be funded for a TV special explaining 
why this is an active census, a part of your duty as a citizen; why 
the census is important; that, in fact, the results are confidential, 
by law, and cannot be used against any individual who fills out the 
form. They will not — they cannot be used in court, or in any other 
way, to compromise their activities. And we felt the census is so 
important that information is set aside and not allowed to be used 
in court, or in any other way. We’ve had a number of hearings on 
that to make sure local officials would not use census data; for ex- 
ample, if too many people were in a building, zoning violations, 
things like this. They can’t use the census for that purpose. We 
need the information, and it’s important for our country to have 
that information, and we’re willing to sacrifice this particular way 
of getting information for other purposes. 

The idea of trying to let people in undercounted communities 
work as census enumerators, without that income being counted 
against the amount that they would get for welfare or other pay- 
ments, has been explored. Representative Meek has suggested that, 
and I think that’s a good idea. 

I think we worry about the undercount in minority communities, 
in communities with a high percentage of new entrants into the
country. There’s a tremendous undercount among taxpaying and, 
in many cases, voting Americans living around the world. Several 
millions of Americans live outside the United States and are not, 
today, counted. And I think that that should be added to the litany 
of people who need to be counted, because the world is changing. 
More and more people are going to be traveling and working and 
retiring outside of their township, or their State, or their country, 
and procedures need to be put in place to attempt to count those 
American citizens. We have that data — in most cases, I think, or 
at least a lot of it — over at the State Department now. People have 
to get passports to travel, and they get visas to travel. If you’re 
talking about an adjustment, you must at least try to reach out, 
or at least mail to those people, there doesn’t seem to be any effort 
to adjust in that regard. But, I will be suspect about the limited 
nature of the proposed adjustment, when they’re not even attempt- 
ing to reach out for a large number of people who, we know, are 
there and should be included. 

In my State of Wisconsin, we had the highest participation in the 
last census, so far as returning the forms voluntarily, of any State 





in the country. Over 75 percent filled out the forms and returned
them. It didn’t just happen. I, as a Representative, mayors, our Senators,
our Governor, other local officials did repeated public service
announcements, letters to the weekly columns outlining to people 
the importance of this census and that this was a duty of citizen- 
ship. 

We talk a lot about our rights as Americans. We do have a few 
responsibilities, and this is one. I would be very worried, that once 
people realized that they could go to an adjustment, you would see 
compliance plummet, and you would see inaccuracy multiply. This 
is another example of, sort of, the “dumbing-down” of America, if 
you will, if we’re not willing to ask American citizens to do the 
least bit to help their country and to be sure they’re fairly rep- 
resented. It benefits them; it only takes a couple of minutes, and 
it’s private. I think people have an obligation to participate as citizens
in this country and to help make the society work, and work
accurately. I should say in Wisconsin — our mayor in Milwaukee, 
Mayor Norquist, made a special effort, had the employees of the
city government participate actively in helping the Census Bureau 
identify people. 

I think the Census Bureau could work with the post office, 
maybe even figure out a way of seeing if postal employees would 
like to volunteer to be enumerators in overtime, in exchange for 
some payment, because they’re delivering the mail all over America 
everyday, and they have a pretty good idea of who lives where. And 
they could be enumerators in their own time, not as postal employ- 
ees; but if they volunteered to do that, I think there could be an 
outreach effort there, and that would improve the accuracy of the 
census enormously. And those resources are right within our own
hands. 

So, there are a lot of things that we could do to increase public 
awareness, and to increase public participation, and to make sure 
regardless of whether we adjust or not. And I hope we don’t, be- 
cause I think it would undermine the integrity of the census. But 
even if we do, be sure you do a complete census, and then, if you 
want to adjust it, because otherwise, if it turns out to be unconsti- 
tutional — I know there’s a court case going forward, but that’s be- 
fore the fact, and the courts normally will not get into that kind 
of thing. But, after the fact, the last census was challenged. And 
this census will presumably be challenged, whether they adjust or 
don’t adjust. And you’re going to prejudge that and make a — you 
know, if it predetermines the outcome if you do not go forward in 
a way that, if the court decides the actual enumeration means ac- 
tual enumeration for purposes of elected office, if they decide the 
statute that’s on the book that requires an actual enumeration for 
purposes of redistricting is the law of the land, and enforce it in 
court, and you have not done an actual enumeration to the best 
that you can, you’re going to create potential chaos, or else pre- 
judge a constitutional case. 

So that’s, basically, my pitch, and I wish you well. [Laughter.] 

I hope you can lower the partisan rhetoric and see where we can 
agree, and build on that agreement, because we do want to — we 
have a — this is an important thing, and we want it to be done as 
right as we can for the country. And that’s the best I can say. 





[The prepared statement of Hon. Thomas Petri follows:] 





The Honorable Thomas Petri 
Subcommittee on the Census - May 5, 1998

I am pleased to appear before the Subcommittee on the Census to discuss the 1990 
Census. I served with my good friend from Ohio, Mr. Sawyer, as the Ranking Member 
on the Subcommittee on the Census in the 103rd Congress and have been interested 
in the census process since that time. 

I believe that before we make any policy decisions for the 2000 decennial census, we 
must take a hard look at the Census Bureau's operation of the 1990 Census. Part of 
this operation included conducting a postenumeration survey (PES), a survey with 
dramatically flawed statistical results. These flawed results add to my concern about 
the Census Bureau's plans to conduct this same type of procedure in 2000 on a larger 
scale in half the time. 

I believe that my state of Wisconsin would have had a congressional seat taken had
the sampling adjustment to the 1990 Census been implemented. Luckily for the people 
of Wisconsin, the Supreme Court ruled that the sampling adjustments were too 
inaccurate to have been used for reapportionment of seats and we were not penalized. 

In 1990, Wisconsin had the highest voluntary census mail response in the country. Let 
me take a moment to discuss the efforts made by Wisconsin in 1990 to promote the 
census. We had a statewide public awareness plan as well as extensive grass-roots 
efforts to work with the Census Bureau to make sure that accurate address files were
available to use for questionnaire distribution. I was personally involved in the
concerted efforts made by the people and the local governments of Wisconsin which 





brought our response rate to 75%, a rate far above the national average of 65%. 

I believe that we need to use the lessons learned from Wisconsin when we look to the 
2000 Census. We should not artificially inflate population counts for some areas by
deleting people who made an effort to fill out their forms. Instead, as we did in 
Wisconsin in 1990, we should make every effort to add resources and enhance 
methods for enumeration. 

Some such common sense efforts include having the Census Bureau work with 
local governments to construct the best possible address files and strengthening 
partnerships with the U.S. Postal System. I also suggest we use some creativity to 
capture the missing addresses. For example, we could check State Department records 
to document overseas individuals. In a global economy, efforts like these made by the 
Census Bureau are becoming increasingly important. 

Additionally, we should be stressing the mandatory and confidential nature of the 
Census to promote a higher response rate. We should not instigate a downward 
spiraling of participation by promoting a partial count. Furthermore, I am concerned that 
the Census Bureau will not allow local governments to question the accuracy of the 
census count against their records. This was used by Markesan, Wisconsin in 1980. 

I will be more than happy to answer any questions the Members of the Subcommittee 
may have for me. 





Mr. Miller. Thank you all. Thank you very much for your com- 
ments. I agree; it’s unfortunate it is more partisan than politicized 
to a large extent. I don’t know how it was back in 1990, 1991, 1992;
I wasn’t here. I was first elected in 1992.

Mr. Sawyer. I can answer that, Mr. Chairman. 

Mr. Miller. Yes; would you? [Laughter.] 

Mr. Sawyer. The census is always a difficult political contest be- 
cause the stakes are so high. But in 1990, we worked very hard — 
Tom Ridge and I occupied these counterpart positions at that time. 
We worked very hard not to prejudge the question of whether or 
not the census ought to be adjusted. We felt that that was some- 
thing that ought to be left to the scientists, to the professionals, to 
the demographers and statisticians, and ultimately to the Director 
of the Census and to the Secretary of Commerce. We recognized 
that, while it is the constitutional responsibility of the census to 
conduct the count in such a way as the Congress shall by law di- 
rect, that it would probably be a mistake to try to direct that tech- 
nique by a show of hands on the floor of the House, so we did not 
do that. I was under a good deal of pressure from one side; Tom 
was under a good deal of pressure from the other. We were able 
to refrain from that and to allow the count to carry itself out in 
a way that was least disruptive of the plan that the Bureau had 
taken into this enormously difficult undertaking. 

Mr. Miller. Now, I think I heard that you were critical of the 
administration for their cooperation with Congress back in 1991 
and such, which we are having that concern today. One of the con- 
cerns that I have 

Mr. Sawyer. I really wasn’t. I mean I was not 

Mr. Miller. Well then, great. [Laughter.] 

Mr. Sawyer. The only point I was critical of was, after the fact, 
when it became quite difficult to get the Commerce Department to 
come and present information, but that was after the fact 
when 

Mr. Miller. OK. 

Mr. Sawyer [continuing]. It could not be harmful to the conduct 
of the census. 

Mr. Miller. One of the concerns I have is that the administra- 
tion has unilaterally, dramatically, radically, changed the system 
this time around because they’re not doing an enumeration. As 
Tom was saying, back in 1990 the decision of adjustment was in 
1991, after that. But now, there’s no opportunity, no fallback posi- 
tion, and they’ve never come to us to even ask. They’re moving full 
speed ahead with this plan, regardless of what Congress has to say 
or think, and we have a hard time getting the information out of 
the administration. There’s a lot of stonewalling going on in the ad- 
ministration. 

Mr. Sawyer. This design was put in place, not by a Democratic 
administration, but under the direction of Dr. Bryant, who was the 
Census Director under President Bush. It was in response to the 
enormous difficulties that she had encountered in trying to carry 
out the 1990 plan which is essentially, as I mentioned, a 30-year- 
old plan. Grounded as it was — and the mail-out/mail-back tech- 
niques that were put in place in the 1960’s, they don’t work as well 





today as they did in the 1960’s, and it was that, I think, that she 
was responding to. 

Mr. Miller. For example, I don’t know how many details you’re 
in on this current 2000 plan, but the 2000 plan was just released 
to Congress last year. It may have started in theory, with Dr. Bry- 
ant, but we’re really starting to get the information today. 

One of the concerns, for example, is that they did sampling of 
150,000 households back in 1990. This year they’re talking — in 
2000, they’re talking about 750,000 households, but they’re going 
to do it in half the amount of time. They’re going to have a sample 
five times larger and do it in half the time. In 1990 they used the 
professional staff of the Census Bureau; now they’re going to use 
the part-time help. So you use less experienced help, and so you 
say, “Wait a minute; can it be accomplished?” I would think, as 
you’ve mentioned, that you had concerns they couldn’t complete it 
back in 1990 with only 150,000 households; now we’re going to go 
five times larger. The concern is to design a system, and that’s the 
reason GAO has raised serious doubts. Since you won’t do an enumeration
in the first phase, you will have nothing to fall back on.
That’s the scary thing about this whole system. 

Mr. Sawyer. The thing that I think Dr. Bryant was responding 
to — and let me just add that the release of the report on the plan 
that took place last year, I think, was a good thing. I congratulate 
the majority in having called for that, as it did, but that plan has 
been accessible throughout the decade, and it was available as it 
continued to evolve throughout the decade — to Members of Con- 
gress. I had the availability of it, and Tom did as well. The enor- 
mous difficulty, from my point of view, is that Dr. Bryant was try- 
ing to respond in putting together this plan to the terrible political 
difficulties that come when you have two counts, one number that 
one side advocates and another number that another side advo- 
cates, and it was her view — although I’m not here to defend that — 
and she was very clear about it throughout her term in office, that 
the plan that she wanted to put forward for 2000 should be ground- 
ed in a one-number census, so that you did not have competition 
between two numbers in a sense of winners and losers that would 
yield a political decision, rather than one that was grounded in 

Mr. Miller. Let me 

Mr. Sawyer [continuing]. The statistics and demography. 

Mr. Miller. Let me ask, and my time’s up, but let me ask just 
one final question. 

Mr. Sawyer. Sure. 

Mr. Miller. As Congressman Petri asked, that he felt we should 
at least do the full enumeration so we have something to fall back
on, you don’t agree with that idea? You think that we should 100 
percent rely on sampling? 

Mr. Sawyer. No, I don’t think we should ever want to 100 per- 
cent rely on sampling. I think the efforts to do the fullest possible 
count that underlie this plan are extremely important. I am torn,
as you are, as Tom is, about whether or not we simply ought to go 
with a one-number census, as Dr. Bryant proposed, in order to 
avoid the political conflict that took place after 1990 or to go with, 
as you refer to it as — you didn’t use the term — but it’s virtually a 





safety net census that uses two different techniques and that you 
can pick and choose between those at the end. 

It’s a terrible dilemma, but I can tell you that having been 
through the political fight of 1990, 1991, and 1992 that I can cer- 
tainly understand Dr. Bryant’s motive in leaving the professional 
counting techniques internal to the census itself, the career profes- 
sionals within the census, rather than bringing them out and hav- 
ing a political fight among elected officials over which number 
ought to be chosen. 

Mr. Miller. Next, Mrs. Maloney. 

Mrs. Maloney. I want to thank both of you for your testimony. 
We don’t have copies of your testimony, and may I ask staff if they 
could get copies for us now of both Congressman Petri’s and Mr. 
Sawyer’s testimony? 

Mr. Sawyer. We brought copies. 

Mrs. Maloney. Mr. Petri’s, if I could. OK, I’d like a copy of 
yours, too, Tom, if I could. 

Mr. Petri. Mine is sort of a work in progress. [Laughter.] 

Mrs. Maloney. OK. [Laughter.] 

OK, I’d like to — - 

Mr. Petri. I’ve got the only copy here. 

Mrs. Maloney. I’d like to ask — well, why don’t we make a copy 
so that everybody has a copy? 

Mr. Petri. OK.

Mrs. Maloney. I’d really like to ask both of you whether or not 
you rate the 1990 census as a success or a failure? And, how do 
you measure its successes and/or failures? 

Mr. Sawyer. Want to go first? 

Mr. Petri. Well, I think it’s like a lot of things in life; it wasn’t 
perfect, and it could be improved, but it was certainly within the 
range of the other 18 censuses or 20 censuses that we’ve had since 
the Republic was founded. There have always been various prob- 
lems with the census and different populations; it’s nothing new, 
but if there are ways that we can actually improve it, we ought to 
do it. I’m just concerned that, for example, one of the things that 
we were able to do in the 1990 and the 1980 censuses, we will not 
be able to do if we go to a complete adjustment approach. That is 
to involve State, and local, and school board, and other local offi- 
cials in correcting error. If the numbers and the tract numbers are 
massaged, and there’s no reference to objective reality, and you’re 
on a school board or you’re on a city council, there’s no way you 
can challenge and correct the number for your city, or town, or 
whatever. Now you can, because they do a headcount; they send 
the figures back to the local units of government; they look at them 
and they say, “Hey, wait a minute.” 

In one town in my district they missed a whole ward. It’s a little 
town of 3,000, and the Census Bureau said there were only 2,200 
people in the town. Town officials knew that wasn’t right. And so 
the local officials were able to go in and document the discrepancy 
and prove that the Bureau had made a factual error and had left 
out this ward and get it corrected. And that’s part of the checks 
and balances and getting people involved at all local levels of gov- 
ernment. If they had mailed the town an adjusted number, what 
could they have done? The Bureau would have said, “Well, we ad- 





justed it, and this is not an accurate number; we’ve pulled some 
people out of there because you were over-represented somehow.” 

And they are talking about adjusting downward and upward. In 
Wisconsin, if they had been adjusted downward, as well as upward, 
when the last adjustment was considered we would have had our 
actual count reduced, and I don’t think that’s going to lead to pub- 
lic confidence in the system. 

Mrs. Maloney. Mr. Sawyer, do you think the 1990 census was 
a success or a failure? 

Mr. Sawyer. Well, in some ways it was an extraordinary success 
in that it undertook the largest count ever attempted in this Na- 
tion. But, the truth of the matter is that, as we encounter the kinds 
of problems that we did, that by several critical measures, it was 
the first census in modern times that was less accurate than the
previous decade. And in some critical measures, particularly in 
terms of the differential undercount, it was an enormous error and 
the largest ever encountered in the entire measurement of that 
particular quality in the census. Let me, also, suggest that the op- 
portunity for local involvement is not diminished, but substantially 
increased, in the 2000 plan. It involves both pre-census and post- 
censal involvement; the capacity to challenge is enhanced rather 
than diminished and, in fact, if it were diminished in meaningful 
ways, I would share the same kinds of concerns. I do share those 
concerns. I think there needs to be powerful local involvement, but 
it’s even more critical that it take place ahead of time, in the devel- 
opment of address lists which was one of the places where great 
difficulty was encountered in the first place. 

Mrs. Maloney. I understand that the 2000 census will not be ab- 
solutely perfect, but do you believe that it will be more accurate 
than the 1990 census? 

Mr. Sawyer. Well, my belief is that if we attempt to redesign it 
on the floor of the House, we will encounter problems that we have 
never anticipated. My belief is that the design proposed for 2000 
is better suited to the era in which it is being used than the 1990 
census was to 1990, and it’s certainly more appropriate than trying 
to reuse the 1990 census in the year 2000. 

Mrs. Maloney. OK, my time is up. Would you like to comment 
on that, Mr. Petri, or not? 

Mr. Petri. Well, I think if we don’t be sure that we go forward 
in a secure way, the chances are we will end up with an enormous 
mess in 2002 or 2003. If we do a pure adjustment, and it turns out 
actual enumeration and existing law requiring redistricting to be 
done on the basis of an actual count should be held to be the law 
of the land, then either we have to do a new census or, I guess, 
stay that redistricting. I don’t know what they would do, at that 
point, if they didn’t have the data that they could work on. 

So, we are heading toward a potential train wreck if we’re not 
careful. And I do think it’s worth — even if it’s inconvenient — trying 
to figure out some way that we can all agree to make sure that we 
have as complete a count as possible. If people feel it to be more 
accurate by adjusting it, well, I’ve never objected to adjusting the 
census for certain purposes because, I think, it probably is more ac- 
curate on a statewide or nationwide basis. But when you get down 
to local units of government, it’s not more accurate. And for electoral 
purposes, it just strikes me it’s a violation of the spirit of, 
“one man, one vote,” rather than adjusting results that for some- 
one’s idea of equity if people don’t bother participating. 

Mr. Sawyer. Mr. Chairman, I don’t want to get into a 

Mr. Miller. OK. 

Mr. Sawyer [continuing]. Give and take here, and I know that 
you don’t. I would welcome the chance to respond to some of those 
comments in writing. 

Mr. Miller. OK. 

Mr. Sawyer. I think some of the concerns are well-placed, I 
think some are not. And in any event, it is more than I can do sim- 
ply sitting here going back and forth. I’d be happy to expand on 
any of those things, but I leave it to your discretion if I could sub- 
mit comments 

Mr. Miller. Yes. 

Mr. Sawyer [continuing]. For the record, it would be helpful. 

Mr. Miller. I appreciate it. We do have two other panels of wit- 
nesses — 

Mr. Sawyer. Yes. 

Mr. Miller [continuing]. And we want to make sure we have 
enough time to properly be able to hear from them. 

But at this time, let me call on Mr. Snowbarger. 

Mr. Snowbarger. Thank you, Mr. Chairman. First of all, let me 
thank both Congressman Petri and Congressman Sawyer for being 
here and for being actively involved in the 1990 census. You were 
in Washington dealing with those things; I was in Topeka, KS, in 
the State legislature, as the ranking Republican for reapportion- 
ment and redistricting, as well as on the NCSL, National Con- 
ference of State Legislatures Task Force on reapportionment. We 
were watching post enumeration sampling very, very closely, and 
frankly, very much opposed to it in our State, the State of Kansas, 
and tried to keep that information available to the Census Bureau 
all the way through. I will tell you that having gone through the 
process of drawing the maps for State legislative districts, I want 
to echo the concerns of Congressman Petri, that if you sample — I 
think, particularly for the census block, census tracts — those small- 
er sampling units which, frankly, we use. We broke them down 
that finely. In particular for State House of Representative seats, 
and I’m concerned about the accuracy at that level. 

Let me go to a different line of questioning, though. And, Con- 
gressman Sawyer, it’s my understanding — again, I wasn’t here for 
the debates — but it is my understanding that you were quite a pro- 
ponent of the post-censal local review. Could you just talk a little 
bit about the local review and why you thought that was very im- 
portant? 

Mr. Sawyer. Well, it’s important to have local involvement at 
virtually every level. As you suggest, it is sometimes possible, just 
through administrative oversight, to miss whole units of popu- 
lation. In my district — we all have stories — in my district we had 
an apartment complex that was named after an adjoining commu- 
nity but it was not in that community. So it was deleted from one 
and put in the other. Local communities observed that and pro- 
tested it and that was altered. I think it’s important, however, to 
point out that when we talk about small area inaccuracy, we’re not 
talking about 100,000 level. For the most part, we’re talking about 
census block levels. We don’t draw districts that are the census 
block size. We don’t do virtually anything with census block 
size 

Mr. Snowbarger. No, you aggregate 

Mr. Sawyer. You aggregate 

Mr. Snowbarger. No, you aggregate 

Mr. Sawyer. You aggregate them and the errors tend to cancel 
themselves out. They’re not great to 

Mr. Snowbarger. Well 

Mr. Sawyer [continuing]. Begin with. 

Mr. Snowbarger. Well, I think I would disagree that 

Mr. Sawyer. Well 

Mr. Snowbarger [continuing]. They tend to cancel themselves 
out. That’s quite an assumption. If they’re all inaccurate, to say the 
inaccuracies go both ways, particularly in the size of a State 
legislative district which may not 

Mr. Sawyer. Let me 

Mr. Snowbarger [continuing]. Be very large at all. 

Mr. Sawyer. Let me suggest, however, that the kinds of inac- 
curacies that result from pure head-counting techniques in 1990 
did, in fact, yield undercounts of some 10 million, double counts of 
some 6 million, and we frequently refer to that as an undercount 
of 4 million; it’s not. 

Mr. Snowbarger. Sure, as you suggest 

Mr. Sawyer. It is an aggregate error of 16 million, and those 
kinds of mistakes are important. 

Mr. Snowbarger. Right; let me continue on with the local review. 
Do you still feel strongly that that part of the process is 
important? 

Mr. Sawyer. Well, I’m not sure that the same kind of local re- 
views used in 1990 is appropriate. But I believe there ought to be 
opportunities for local review. 

Mr. Snowbarger. As I understand it right now, the Census Bureau 
really hasn’t left enough time to complete the Integrated Coverage 
Measurement and still allow for the post-census local review. 
Does that concern you in any way? 

Mr. Sawyer. It does. 

Mr. Snowbarger. Do you have an answer to that? 

Mr. Sawyer. More time. 

Mr. Snowbarger. And, between now and the year 2000? 

Mr. Sawyer. No. 

Mr. Snowbarger. We’ll see what we can do to petition 

Mr. Sawyer. No. 

Mr. Snowbarger [continuing]. The maker of time, but 

Mr. Sawyer. No, in the post — well, that’s part of the problem. 
Part of the problem is that, for about 100 years now, we have 
worked with what are essentially 10-year planning horizons, and 
we wind up in an execution-planning crunch every decade of the 
kind that we’re running into right now. It was one of the fun- 
damental problems that was encountered in the run up to 1990, 
and it’s, I think, the single most important thing that can be taken 
from the kind of testimony that we’ve offered here. Because if we 
allowed those kinds of problems to repeat themselves, we will face 
an even greater problem in 2000 than we did in 1990. 

Mr. Snowbarger. Well, yes, I’m concerned; both of you have 
given examples now of, well, relatively large blocks, depending on 
the type of district 

Mr. Sawyer. Right. 

Mr. Snowbarger [continuing]. You’re putting together, relatively 
large blocks of people that were just left out, whether it was ad- 
ministrative error or whether, you know, whatever the matter. And 
it does concern me that we don’t have any local review process. Do 
you have any thoughts on the — my understanding is, in the new 
census, that there will be the ability of people to check off more 
than one racial block. Are you familiar with that? 

Mr. Sawyer. Intimately. [Laughter.] 

Mr. Snowbarger. OK. [Laughter.] 

If you want to share some of your intimate thoughts, I’d appre- 
ciate it. 

Mr. Sawyer. Well, as you know, in the course of this decade, we 
have seen enormous demographic changes in the make-up of our 
population. And as a result, a significant number of people have 
sought better ways to reflect their personal identity in the way 
they are counted. One of the movements was to create what has 
come to be called a multi-racial block. The difficulty is that it 
makes it extraordinarily difficult to make any kind of comparison, 
from decade to decade, to disaggregate the numbers in ways that 
make it possible to use them in the ways in which they have been 
traditionally applied over the last 30 years, and to track, for a vari- 
ety of purposes, ranging from everything from pure scientific re- 
search, to public health, to everything else — what the information 
that is needed to make sound — public and private — policy. 

To that end, the OMB conducted a series of reviews. Tom and I 
conducted hearings, probably the most thorough hearings ever con- 
ducted on that topic — sometimes, I think, to Tom’s chagrin — 
[laughter] — about how best to approach that dilemma. After a good 
deal of work, OMB, last year, decided that checking more than one 
provided the broadest possible range for people to identify them- 
selves as they understood their own identity, and to make it pos- 
sible to have continuity and comparability in data over the course 
of time in ways that would be most useful for decisionmakers. 

Mr. Snowbarger. Mr. Chairman, I think my time has expired. 
Thank you. 

Mr. Miller. Thank you. Mr. Davis. 

Mr. Davis of Virginia. Just a couple of questions; it sounds like 
some of the advocates at the Census Bureau, and others are basi- 
cally saying, “We’re not going to get a fair count. We just don’t 
know how we can improve the count.” And they’re putting their 
eggs on the sampling basket and trying to make that better. And 
you talked about the information collected, indirectly, that we’ve 
tried to use in the past; could you elaborate on what exactly that 
is? 

Mr. Sawyer. Well 

Mr. Davis. I’m talking about the postmen and the 

Mr. Sawyer. Sure. In the past when it’s been impossible to get 
actual counts from 

Mr. Davis. Impossible, meaning people won’t fill out the forms? 

Mr. Sawyer. Well, first of all 

Mr. Davis. Or, answer the door? 

Mr. Sawyer [continuing]. People didn’t fill out the forms. The 
kind of mail-out/mail-back techniques that are in place today were 
really first put in place after the Second World War, and they, I 
think arguably, have never been an actual enumeration as the 
founders might have conceived of it or as it’s sometimes character- 
ized today. I believe it has been an actual enumeration, mail-out/ 
mail-back. The rates were fairly high to begin with. 

In the course of the last couple of decades, those rates have 
begun to fall, and they fell markedly in the 1990 census. They 
made it very difficult to achieve counts and, particularly, where the 
return rates were down in the 30 to 40 percent range. It meant 
that very large numbers of people had to be sent into very difficult 
areas to count, and they took substantially longer than had been 
anticipated. 

Those problems compound one another. They wound up with 
greater costs, less accuracy; it took longer. And so the disparity in 
time between the actual census date and the completion of the 
count created problems but, in addition, it required that enumera- 
tors going door to door would have to go back three and four times 
ultimately resorting to what is loosely termed “curb stoning.” That 
is to say they first went to last-resort procedures asking postmen 
they may have encountered, or building managers, or people who 
looked like they knew the neighborhood; how many people lived 
there, and what was the make-up of the household? 

Finally, in the end, what it really results in is that a substantial, 
knowable number of households are guessed at. These are not ac- 
tual enumerations. 

Mr. Davis. So it’s based on, per se, gossip? 

Mr. Sawyer. I don’t — those are terms that are not used. 
They’re 

Mr. Davis. But they are, though. You’re asking a neighbor what 
do they see in there, and they 

Mr. Sawyer. And presumably, they gave you the best guess they 
can. But we should understand that those traditional techniques 
involve a substantial amount of that 

Mr. Davis. But you have the 

Mr. Sawyer [continuing]. Kind of guessing. 

Mr. Davis [continuing]. Same thing with sampling, don’t you? 
Don’t you have the 

Mr. Sawyer. I don’t believe so. 

Mr. Davis. In the tighter timeframe for 2000 could make this 
problem worse? 

Mr. Sawyer. Tight timeframes always are a problem. Those who 
complained that the sample was not large enough in 1990, found 
a plan that was proposed as a result of the lessons that were 
learned, that is some five times greater in terms of the actual sam- 
ple. It will be difficult to collect, but it is a — it will yield a far finer 
statistical analysis of the uncounted population than anything that 
was anticipated in 1990. 

Mr. Davis. Why don’t you have the same problems with sampling? 

Mr. Sawyer. I’m not sure I understand your question. 

Mr. Davis. Well, with sampling you have to, again, you have to 
get an accurate count somewhere and then extend this sample, to 
the uncounted households. Why wouldn’t you have the same kind 
of problems in getting the correct number, ethnicity, and all those 
kinds of issues? 

Mr. Sawyer. You do have those problems, but the use of sam- 
pling in an attempt to refine known areas of error is improved with 
a larger sample; virtually all of us understand that. This is not a 
poll. Polling in this country is grounded in numbers that measure 
the entire Nation in samples of 1,600 to 3,000 

Mr. Davis. Right. 

Mr. Sawyer [continuing]. And if they’re most accurate. This is a 
far larger undertaking. It is vastly more difficult, as you suggest, 
but the effort, I believe, is worth it if we can refine the numbers 
from the known level of error that we’ve encountered in 1990. 

Mr. Davis. But it seems that the errors would be magnified by 
a shorter timeframe. 

Mr. Sawyer. The errors become magnified by having too small 
a sample. The ability to have the largest possible sample is quite 
important. But, as I suggested, if you’re suggesting to me that 
there is difficulty in recruiting sufficient numbers of people, train- 
ing them well enough, getting them to the household, and getting 
the counts done, you’re absolutely correct. But, if you mean to sug- 
gest that by attempting to do sampling, that it makes the matter 
of a bad count to start with worse, I think you’re incorrect. I think 
the opportunity to refine that count is far improved when you use 
the kind of techniques that are available to the Nation today. 

Let me just say one other thing; this is not the first time that 
new techniques have been used in the census. The mail-out/mail- 
back was a substantial departure from what had been done in the 
past, and it improved the count over what would have been pos- 
sible today if we were still trying to do everything sending out peo- 
ple to go door to door. We just simply wouldn’t be able to do it. 

The same thing happened in the 1880’s, when we weren’t able 
to tabulate the census results. It took 8 years to tabulate the 1880 
census, and so it was in 1890, that the use of punch cards and ma- 
chine counting to tabulate the census was, for the first time, used. 
Now, that was not handwork either, but it resulted in a substan- 
tially improved count and a much more usable data because it was 
usable throughout the entire decade. That’s where IBM came from. 
We just have always been a nation of innovators, and I think we 
have the opportunity and a compelling case to be made for innova- 
tion in the 2000 census. 

I genuinely believe that if we attempt to make substantial 
changes in the plan that has been evolving over the entire course 
of this decade, if we attempt to make major changes in those plans 
at this late date, that we will exacerbate a problem that is already 
difficult. 

Mr. Miller. Thank you. Let me thank you both for being here. 

Mrs. Maloney. But, Mr. Chairman, if I could, please. I have a 
series of additional questions, but in the interest of time — because 
I know we have many other panels — I would like your permission 
to have both of our colleagues respond, in writing, to my questions 
and have them part of the permanent record. 

Mr. Miller. Without objection. 

[The information referred to follows:] 



ONE HUNDRED FIFTH CONGRESS 

Congress of the United States 

House of Representatives 

COMMITTEE ON GOVERNMENT REFORM AND OVERSIGHT 
2157 Rayburn House Office Building 
Washington, DC 20515-6143 





May 19, 1998 


The Honorable Thomas C. Sawyer 
U.S. House of Representatives 
1414 Longworth House Office Building 
Washington, D.C. 20515-3514 

Dear Mr. Sawyer, 

Thank you for testifying before the Government Reform and Oversight 
Subcommittee on the Census on May 5, 1998. Because of time constraints, I was left 

Do you feel that the use of a coverage measurement methodology (PES/ICM) which is 
not more accurate and closer to the truth in geographic areas containing populations of 
less than 100,000 persons (which covers most block groups, census tracts, townships and 
towns) would be a good public policy choice? 

If Secretary Mosbacher had used the adjusted figures in 1991, how would you have 
responded to the state of Pennsylvania after the processing error was found in 1992? 

(Pennsylvania would have lost a Congressional seat to Arizona erroneously from 
the June 1991 PES adjusted counts) 

If Secretary Mosbacher had adjusted the 1990 Decennial Census, in your opinion, would 
the lawsuits have ceased from one group of cities and increased from another? 

All three of our witnesses on the second panel testified during the hearing about the 
problem of correlation bias in the adjusted counts and how it inflates the undercount. 
They testified that the Census Bureau’s own studies conclude that more than half of the 
undercount estimates were not true undercount but correlation bias. Would it bother you 
if the ICM proposed sampling adjustment plan to be used in 2000 will have the same 
problems? 

According to the National Academy of Sciences half of the undercount in 1990 was 
because people never received a census form, not because they received one and did not 
send it back. Would it then follow that a complete Master Address File would have 
resulted in the best census in history? 

My questions and answers will be part of the permanent record of the May 5, 1998 
hearing. Again thank you for input into this important process. 



Dan Miller 
Chairman 

Subcommittee on the Census 


CC: Rep. Carolyn Maloney 





THOMAS C SAWYER 

14TH DISTRICT 
OHIO 


Congress of the United States 

House of Representatives 

Washington, DC 20515-3514 


COMMERCE 

SUBCOMMITTEES 
TELECOMMUNICATIONS, TRADE, 
AND CONSUMER PROTECTION 
FINANCE AND HAZARDOUS 
MATERIALS 
OVERSIGHT AND 
INVESTIGATIONS 


June 18, 1998 


The Honorable Dan Miller 

Chairman, Subcommittee on the Census 

Committee on Government Reform and Oversight 

U.S. House of Representatives 

114 O'Neill Building 

Washington, D.C. 20515 

Dear Chairman Miller: 

Thank you for the opportunity to testify before 
the Government Reform and Oversight Subcommittee on the 
Census on May 5, 1998. I have enclosed answers to 
your additional questions. I hope you find them 
helpful. 

Please let me know if I can be of further 
assistance. 



TCS/djm 

Enclosure 


260 S Chestnut Street 
Ravenna, OH 44266-3031 
(330) 296-9010 


1414 LONGWORTH HOUSE OFFICE BUILDING 

Washington, DC 20515-3514 
(202) 225-5231 


411 Wolf Ledges Parkway 
Suite 105 

Akron, OH 44311-1106 
(330) 375-6710 
TDD: (330) 376-5443 


THIS STATIONERY PRINTED ON PAPER MADE OF RECYCLED FIBERS 





Response to Written Questions by Chairman Dan Miller 
from Congressman Tom Sawyer 


Subcommittee on the Census 
Committee on Government Reform and Oversight 
Hearing on May 5, 1998 


1. During the Subcommittee hearing on May 5, 1998 you mentioned that block 
level data are not even used for redistricting. Are you maintaining that block 
level data are not extensively used for legislative and local redistricting? On 
further reflection would you wish to revise that comment with regard to 
congressional redistricting? 

While I appreciate the opportunity to revise my remarks with regard to redistricting, a 
clarification of my comments would perhaps be more helpful to the subcommittee. As 
you know, legislative districts are created for geographic areas much larger than a census 
block. While census blocks are aggregated to form districts of varying size, depending on 
the political body (i.e. congressional, state legislative, school board, etc.), it is really the 
size (in terms of population) and composition of the entire area that is of concern both to 
those who are charged with drawing the boundaries and courts that must determine if 
those districts meet the test of equal representation. 

Let’s take the case of a congressional district as an example. When a state legislature or a 
court evaluates districts within a state, they must ensure that the population of each 
district is as equal as possible. If a state is subject to monitoring under the Voting Rights 
Act, it must also show that the racial composition of districts meets certain requirements. 
Therefore, it is important that census counts be as accurate as possible at the 
congressional district level, which is an aggregate of census blocks and tracts, so that 
population size can be compared. A court would not be interested in the population size 
of a block, or even a tract, within each district. 

State legislatures might closely study data from census blocks that form the perimeter of 
districts, to make the fine distinctions that need to be made in allocating population to one 
district or another, but their goal is to create entire districts that are as equal as possible. 
Population numbers produced by a census that combines traditional counting methods 
with modern statistical sampling will be more accurate for areas the size of a 
congressional district, as well as for many smaller areas that have had the highest 
undercounts in the past. Traditional counting methods alone are likely to produce 
numbers that are far less accurate at the congressional district level. Therefore, while 
those districts may appear to be equal in size, they in fact won’t be at all. 

It is most important to remember that even though census figures produced through a 
combination of direct counting and sampling are not perfect at the smaller geographic 
levels, neither are the figures produced by older counting methods alone. In fact, those 
latter figures are highly flawed at the block level, and even more flawed as one 
aggregates to larger areas, making the equality of legislative districts a myth, at best. 

2. Would you agree that census tract, block group, and township level data are 
widely used in redistricting? 

Census tract and block group data are used in aggregation in the redistricting process, 
helping those who draw the lines to put together political units that are equal in size 
numerically and meet certain tests for demographic composition. That is why it is 
important to produce the census numbers that maintain their accuracy for larger 
geographic areas. 

Older counting methods, such as those used in 1990, produce high levels of error that do 
not diminish as smaller geographic units are aggregated to form larger, useful units of 
governance. With regard to townships, your question is unclear because townships are 
political units that are created on the basis of aggregating smaller geographic units. 

3. Do you feel that the use of a coverage measurement methodology (PES/ICM) 
which is not more accurate and closer to the truth in geographic areas 
containing populations of less than 100,000 persons (which covers most block 
groups, census tracts, townships and towns) would be a good public policy 
choice? 

I do not agree with the premise of your question that the PES/ICM methodology planned 
for the 2000 census will produce less accurate population figures for areas smaller than 
100,000 in population than a census that relies only on traditional counting methods. 
Census Bureau evaluations showed that block level data in the 1990 census had an 
average error rate of eight percent. Those non-sampling errors were not reduced as the 
data was aggregated to higher levels. The worst (and least defensible) public policy 
choice would be to require a census design that is likely, by all accounts, to result in an 
undercount that is as large, or larger, than in 1990. 

4. If Secretary Mosbacher had used the adjusted census figures in 1991, how would 
you have responded to the state of Pennsylvania after the processing error was 
found in 1992? 

I am confident that if Secretary Mosbacher had decided to adjust the 1990 census counts 
based on the results of the Post Enumeration Survey, the figures would have been 
scrutinized much more closely before they became official, and the processing error (not 
an error in the methodology, by the way) would have been discovered. However, your 
question suggests that because there was a processing error in 1991, the methodology 
proposed for 2000 will not work. To the contrary, it was the Census Bureau that 
discovered and fixed the error (rather than hiding it), thus helping them to develop 
improvements in methodology and operations as they began to plan for the next census. 





5. If Secretary Mosbacher had adjusted the 1990 Decennial Census, in 
your opinion, would the lawsuits have ceased from one group of cities and 
increased from another? 

I can’t give you an answer to this question that goes beyond mere speculation. Suffice it 
to say that history has shown that the census has always been the subject of much 
litigation, from the way people are counted, to where they are counted, to who is counted. 
It is hard to imagine that any decision, one way or another, on census design will either 
stem or increase the flow of litigation. As long as the census is the foundation of political 
representation and the allocation of fiscal resources, someone is bound to be dissatisfied 
with the result and seek a remedy through the courts. 

6. All three of our witnesses on the second panel testified during the hearing about 
the problem of correlation bias in the adjusted counts and how it inflates the 
undercount. They testified that the Census Bureau’s own studies conclude that 
more than half of the undercount estimates were not true undercount but 
correlation bias. Would it bother you if the ICM proposed sampling adjustment 
plan to be used in 2000 will have the same problems? 

In order to provide a useful answer to your question, it may help to clarify what 
“correlation bias” is. Contrary to the assertion in your question, correlation bias resulting 
from the dual system methodology used to measure census coverage understates, not 
“inflates,” the undercount. Let me explain why. The DSE method clearly measures three 
situations: people who were counted in the initial phase of the census but not the post 
enumeration survey; people counted in the post-census survey but not the initial phase; 
and people who were counted in both phases. Those “cells” are easy to understand. It is 
the so-called “fourth cell” - people who are missed both in the initial census count and in 
the post-census survey - that creates correlation bias; that is, error related to the inability 
to capture some of the universe you are trying to count no matter which method is used. 
So to the extent there is correlation bias which cannot be corrected or reduced using 
known statistical assumptions, the Census Bureau underestimates - not overstates - the 
size of the undercount. 

Given that the presence of correlation bias causes the methodology to understate the 
number of people missed in the census, I continue to believe that we are better served by 
a census that gets us much closer to a true, if not perfect, count of the population - in 
terms of composition and geographic location - than a census that we know will once 
again miss millions of Americans. 
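
The four-cell reasoning above can be illustrated with a minimal sketch of the textbook two-list (Lincoln-Petersen) estimate. This is an illustration only, not the Census Bureau's actual procedure: the function name, the class, and the figures are hypothetical, and the sketch ignores erroneous enumerations such as double counts. It assumes the census and the follow-up survey miss people independently. If the same kinds of people tend to be missed by both, the true fourth cell is larger than the value computed here, which is why correlation bias would understate the undercount.

```python
from dataclasses import dataclass


@dataclass
class DualSystemEstimate:
    fourth_cell: float  # estimated people missed by both the census and the survey
    total: float        # estimated true population


def dual_system_estimate(census_only: int, survey_only: int, both: int) -> DualSystemEstimate:
    """Two-list estimate, assuming the two counts miss people independently."""
    if both <= 0:
        raise ValueError("need at least one person matched in both counts")
    # Under independence, the unobserved cell is the product of the two
    # one-list-only cells divided by the matched cell.
    fourth_cell = census_only * survey_only / both
    total = census_only + survey_only + both + fourth_cell
    return DualSystemEstimate(fourth_cell=fourth_cell, total=total)


# Hypothetical figures for one area, chosen only to make the arithmetic easy.
estimate = dual_system_estimate(census_only=100, survey_only=200, both=800)
census_count = 100 + 800  # everyone the census itself counted

print(estimate.fourth_cell)           # 25.0
print(estimate.total)                 # 1125.0
print(estimate.total - census_count)  # 225.0 estimated net undercount
```

In this made-up example the estimate puts 25 people in the fourth cell. If the real number missed by both counts were higher, the 225 figure would be too low, not too high.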

7. According to the National Academy of Sciences, half of the undercount in 1990 
was because people never received a census form, not because they received one 
and did not send it back. Would it then follow that a complete Master Address 
File would have resulted in the best census in history? 





First, I believe that the figure concerning the portion of nonresponse attributable to within 
household versus whole household misses is incorrect. I am aware that the National 
Academy of Sciences report referred to a 50-50 split in the types of misses; however, 
NAS panel members have since indicated that the reference in their report was an 
unintentional mistake. In fact, according to the General Accounting Office and the 
Census Bureau, about two-thirds of the people missed in 1990 lived in households that 
were counted (within household misses), while one-third of those missed lived in housing 
units that were not counted. 

That ratio was an improvement over 1980, when it was the 50-50 split to which you 
referred in your question. While this information clearly indicates that the Bureau 
improved the accuracy of its address lists in preparing for the 1990 census, it is also clear 
that there is room for greater improvement in this area. Nevertheless, the evaluations 
demonstrate that even the most comprehensive address file will not produce “the best 
census in history.” In fact, the trend appears to indicate that the undercount is becoming 
more systemic, resulting from factors such as transient living arrangements, distrust of 
government, and other social causes that cannot be overcome with better address lists. 
And even the best efforts to develop a complete address file will still miss some 
nontraditional housing units, where people who tend to be missed are more likely to live. 





ONE HUNDRED FIFTH CONGRESS 

Congress of the United States 

House of Representatives 

COMMITTEE ON GOVERNMENT REFORM AND OVERSIGHT 
2157 Rayburn House Office Building 
Washington, DC 20515-6143 



June 2, 1998 


Representative Thomas E. Petri 
U.S. House of Representatives 
Washington, DC 20515 

Dear Tom, 

Thank you for testifying before the Government Reform and Oversight Subcommittee on 
the Census on May 5, 1998. Because of time constraints, I was left with a number of questions 
unanswered. Therefore, I request that you answer the following questions: 

1. You have mentioned that your biggest concern is with using estimation techniques 
at the smaller geographic units. What is your opposition to using estimation at the 
state level, for reapportionment or funding formulas? 

2. You were very supportive of efforts by the Bureau to work with the Postal 
Service to create the comprehensive master address file and to work with State, 
local and tribal governments. We have found that the Postal Service lists were 
insufficient and local governments were not able to handle this task. Many local 
governments have said that they cannot help the Census Bureau develop its 
address list unless the federal government provides funds for that effort. Would 
you support funding local governments to assist in the address list development? 

3. With regard to promotion and outreach, you have mentioned the Milwaukee 
example in the past as a possible prototype. My understanding of the Milwaukee 
example is that there were great efforts, time and cost, by local governments to 
increase the mail return rate. However, the undercount in Milwaukee was well 
above the national average, and nearly 4 times the undercount in the state. This 
suggests that the Milwaukee example is useful for increasing the mail return rate, 
but not for reducing the undercount. Do you have any suggestions on how the 
Bureau could reduce the differential undercount of minorities and the poor? Is it 
just a matter of spending more on traditional counting measures? If that is the 
case, why did that not work in 1990? 




My questions and your answers will be part of the permanent record of the May 5, 1998, 
hearing. Again, thank you for your input into this most important process. 

Sincerely, 


Carolyn B. Maloney 
Ranking Minority Member 
Subcommittee on the Census 


cc: Rep. Dan Miller 





THOMAS E. PETRI
6th District, Wisconsin

Congress of the United States

House of Representatives

Washington, DC 20515-4906
July 23, 1998


2262 Rayburn House Office Building
Washington, DC 20515-4906
(202) 225-2476


6th District Offices:

(414) 922-1100

Oshkosh, WI
(414) 231-6333


The Honorable Carolyn B. Maloney 
Ranking Minority Member 
Subcommittee on the Census 
Washington, D.C. 20515 

Dear Carolyn: 

I appreciated the opportunity to testify before the Government Reform and Oversight 
Subcommittee on the Census on May 5, 1998. I am sorry I did not have time to answer all 
of your questions about the drawbacks to the use of statistical sampling, and the effective 
measures employed by the state of Wisconsin to produce an accurate enumeration in the 
1990 Census. My answers to your three questions follow. 

1. Your question here properly frames the issue at hand. The methodology the 
Bureau is proposing for use in the 2000 Census would use the integrated coverage 
measurement survey (ICM) to calculate adjustment estimates for subgroups of the population 
within each state. Since the census count is built from the bottom up, it is important that 
these estimates be accurate from the smallest units of government to the largest. If we 
cannot guarantee an acceptable level of accuracy in governmental jurisdictions of less than 
100,000 persons, we are building our statewide estimates on a foundation of sand. The 
figures may, therefore, be no better than the inaccurate local estimates. 

Our experience in reviewing the 1990 Post Enumeration Survey demonstrated that an 
"adjusted" count would have been closer to the real population in some states and farther 
from the real population in other states. I am not sure how "adjustment" would result in 
state totals that are better than the actual enumeration for purposes of reapportionment of 
House seats, since gaining or losing a seat can hinge on extremely small differences in 
population. Since the adjustment would have resulted in an incorrect apportionment of the 
House in 1991, how can we be sure that the same methodology will not produce 
apportionment errors in 2001? Since the Bureau proposes to rely on statistically produced 
numbers for 12 to 13 percent of the population in the 2000 Census - as opposed to less than 
two percent in the 1990 Census - I am wary of using these numbers for reapportionment. 
This is particularly true since there will be inadequate time to examine the accuracy of the 
ICM before Congress has to accept the numbers for reapportionment use. 

Since taxation is tied to representation, I would have difficulty using two sets of 
numbers - one for representation and the other for allocation of federal and state funding. 

2. Since the Census Bureau, as a result of problems identified in the Dress Rehearsal, 
now proposes to perform a 100% canvass of all blocks in the United States, we should 
reexamine the value of the LUCA program as it is now designed. We need to determine 
how much money would be required to have a meaningful impact. I suggest this would be 
a question better examined by your subcommittee. I am more concerned with the fact that 
the post enumeration local review program has been deleted to make more time available to 
the ICM. Local governments should have a chance to examine and challenge the counts 
before they are finalized. 

3. It is not at all clear that the estimates of the undercount in both Wisconsin and 
Milwaukee were entirely accurate. The fact that these estimates were based more on 
observations taken outside the state than any observations within the state calls into question 
their accuracy in measuring the true success of the Milwaukee example. The results simply 
are not conclusive. 

The radical departure proposed for the 2000 Census is the deliberate use of imputation 
in place of non-response follow-up. Because of increased dependence on both imputation 
and statistical inference, the 2000 Census could contain as many as 12 to 13 percent 
manufactured persons. If the statisticians have their way, these percentages could go even 
higher in future censuses. The problem in using estimation is that we would be replacing 
one set of errors with another set some people like better. It also appears that the Bureau is 
depending more on sampling and less on enumeration in order to devote more time and 
resources to an increasingly complex sampling methodology. 

My fear for the future is that if people learn that the census count in 2000 contained 
13% or more "virtual" people, they will have even less motivation to participate. This may 
be especially true for people who have, in the past, taken time to fill out their own 
questionnaires. We could be engaging in a self-fulfilling prophecy and turning a census 
that counts 98% or more of the people into one that only "counts" 87% in 2000 and goes 
downhill from there each successive decade. 

One suggestion I do have is that the Bureau do a better job in outreach than in 1990 
and that they build a better address file. The National Academy stated that over 30 percent 
of uncounted persons were missed because their households did not receive questionnaires. 
Improving that one process alone would have made the 1990 Census the "best in history" - 
even by your measurements. 

I hope my comments have helped to clarify the dangers of estimating the 2000 
Census. If you have any further questions about the 1990 Post Enumeration Survey or the 
Wisconsin experience, please do not hesitate to ask. 



Thomas E. Petri 
Member of Congress 


TEP:pjp 











My questions and your answers will be part of the permanent record of the May 5, 1998, 
hearing. In addition, I recall that during the hearing there were some comments made by 
Representative Petri that you wanted to address. Please include such remarks with the answers to 
these questions. Again, thank you for your input into this most important process. 

Sincerely, 


Carolyn B. Maloney 
Ranking Minority Member 
Subcommittee on the Census 

cc: Rep. Dan Miller 





THOMAS C. SAWYER 

14TH DISTRICT 
OHIO 

Congress of the United States 
House of Representatives 
Washington, DC 20515-3514 


July 13, 1998 


The Honorable Carolyn Maloney 
Ranking Member, Subcommittee on the Census 
Committee on Government Reform and Oversight 
U.S. House of Representatives 
511 Ford House Office Building 
Washington, D.C. 20515 

Dear Mrs. 

Thank you for the opportunity to testify before the 
Government Reform and Oversight Subcommittee on the Census 
on May 5, 1998. I have enclosed the answers to your questions. 
Please let me know if I can be of further assistance. 

Thank you for your leadership on this important issue. 


Sincerely, 



Thomas C. Sawyer 
Member of Congress 

TCS/djm 


cc: Chairman Dan Miller 

Subcommittee on the Census 

Committee on Government Reform and Oversight 




Response to Written Questions by Ranking Member Carolyn Maloney 
from Congressman Tom Sawyer 

Subcommittee on the Census 
Committee on Government Reform and Oversight 
Hearing on May 5, 1998 


1. What is your response to the charge that reducing the historic differential 
undercount through sampling techniques might somehow be subject to manipulation 
for political benefit? 

A. Concerns about the manipulation of statistical techniques to change the 
census results for political advantage have no basis in fact, history, and 
science. Sampling and statistical techniques have been used in the census in 
varying ways since 1940, sometimes adding hundreds of thousands of people to 
the census counts and causing a congressional seat to shift from Indiana to 
Florida following the 1980 census. Yet there simply is no evidence that any 
Administration or any Congress sought to interfere in the design or 
implementation of these methods to direct a certain outcome. The design and 
execution of sampling (and all census operations, for that matter) are complex 
scientific undertakings that require the involvement of experienced and 
knowledgeable scientists, including statisticians, demographers, and 
mathematicians. It would be very difficult, if not impossible, for political 
appointees to direct changes in methodology to achieve a certain outcome. 

Furthermore, the Census Bureau has developed its census plan in consultation 
with some of the nation's premier scientists at the National Academy of 
Sciences and professional scientific associations, as well as experts in 
government operations from the Commerce Department’s Office of the Inspector 
General and the General Accounting Office. All of these independent bodies 
have continued to closely monitor development of census methods and 
operations. Any effort to modify techniques to gain political advantage would 
be easily detected by the Bureau's many outside observers. And finally, 
charges that political staff at the Commerce Department or even the White 
House would somehow change the census numbers before they become final are a 
direct attack on the integrity of career professional employees at the Census 
Bureau who plan, prepare for, and implement the nation’s largest peacetime 
activity. Such charges imply that these Bureau employees would 'look the 
other way' if anyone outside of the Bureau attempted to interfere with the 
objective design and implementation of census methods and procedures. I am 
saddened by such charges and believe they are irresponsible and without any 
merit whatsoever. 

2. Is there any evidence that any partisan influence or manipulations occurred 
at the Census Bureau during any stage of the 1990 census? 





A. There is absolutely no evidence to suggest that the Bush Administration or 
any officials in the Administration exerted any influence over the choice, 
design, or implementation of census methods to affect the outcome of the 
count. Despite the decision by then-Secretary of Commerce Robert Mosbacher 
not to use the results of the Post Enumeration Survey to correct undercounts 
and overcounts in the census (a decision with which I disagreed), there is no 
evidence that the Commerce Department or the White House attempted to 
manipulate the conduct of the census for political benefit. 

3. Will Congressional action or inaction jeopardize the success of the census? 
What is the effect of the timing of the Congressional decisions? 

A. It is important for Congress to conduct oversight of the census in a timely 
and constructive manner. Even with an entire decade between censuses, the 
Census Bureau must follow a rigid schedule to conduct research on census 
methods and procedures, test components of a potential design, develop a plan, 
prepare for the census, evaluate a final plan in a census-like environment 
(the Dress Rehearsal), deploy a complex field structure, solicit local 
support, and execute the census in a nine-month period. Slippage in any of 
these key milestones can place subsequent operations at risk, since not enough 
time may be left for thorough completion of each stage. 

It is very unfortunate, in my opinion, that Congress has not yet given the 
Census Bureau a green light to proceed with its plan. Time and resources that 
could be better spent completing key operations, such as address list 
development, are instead diverted to continued evaluations of fundamental 
design components such as sampling. 

Early in the decade, Members of Congress from both sides of the aisle decided 
that a zero-based review of census methods was needed. My subcommittee 
considered legislation sponsored by Rep. Thomas Ridge (R-PA) and Rep. Harold 
Rogers (R-KY) to require such a review by a National Academy of Sciences 
panel; Congress ultimately passed the Ridge bill without any dissent. Dr. 
Barbara Everitt Bryant, director of the Bureau under President Bush, set in 
motion a detailed research agenda designed to evaluate many new methods and 
operations. In subsequent years, the General Accounting Office, the Commerce 
Inspector General, and Congress itself continued to press the Census Bureau to 
adopt new methods that would help improve accuracy, reduce the persistent 
differential undercount, and contain costs. The Bureau was well on its way 
toward meeting those goals when a new Congress raised concerns about 
components of the census plan without conducting any thorough oversight or 
hearings to assess fully the plan's soundness. The change of heart against 
new census methods came late in the planning process and the subsequent delay 
in finalizing a census design has certainly placed the 2000 census at risk, 
despite the continued dedicated effort of the Bureau's career workforce. 

4. What is the effect of the Bureau acting without a permanent director? 

A. A permanent director, nominated by the President and confirmed by the U.S. 
Senate, would bring the level of leadership and authority necessary to build 
confidence in the census process among key stakeholders that include the 
Bureau's career and temporary employees, local and state officials, civic 
leaders, the public, and Congress itself. In making that observation I do not 
mean to suggest that the Bureau's competent and dedicated staff are not 
capable of preparing for and conducting a census, nor do I mean to suggest 
that the Bureau's senior officials do not exercise leadership or make wise 
decisions. Rather it is simply the nature of a Presidential appointment that 
lends a greater level of authority and attracts a higher level of confidence 
necessary to lead a large and complex organization through a very difficult 
and closely-watched national undertaking. A presidential appointment and 
Senate confirmation imply that the selected individual has the confidence of 
both the President and the Congress to do the job for which he or she is 
chosen competently and fairly. A vacancy in the Bureau’s highest position, 
whether due to a failure to nominate or a failure to confirm, does suggest 
that either the President or Congress does not place a high enough priority on 
the Bureau's work to warrant competent leadership. In this case, the 
President has nominated a well-respected social scientist whose qualifications 
and experience are very similar to all other Census Bureau directors to come 
before him, including those of President Bush’s director, Dr. Barbara Bryant. 

It is now incumbent on the Senate to determine quickly whether the nominee is 
qualified to direct the many important activities of the Census Bureau, 
including the decennial census, and to confirm the nomination absent any 
glaring evidence that the nominee is unqualified. Failure to act before 
Congress adjourns in the Fall would leave the Bureau in a more precarious 
position and, frankly, make it even more necessary for Commerce Department 
officials to assist with census preparations, an outcome that the Bureau's 
critics consistently decry. 

5. There are several challenges which have made census taking more difficult 
over time such as escalating costs, declining levels of public cooperation, 
and the shrinking temporary workforce. Other than sampling, what efforts can 
be made to have a more accurate census? 

A. It is important to remember that sampling is not an end in itself but 
simply one means to an end: a more accurate census that eliminates, to the 
greatest extent possible, the persistent disproportionate undercount of the 
rural and urban poor and people of color. Sampling and various statistical 
techniques have been used in the census since 1940, adding many people to the 
count in an effort to improve coverage. For 2000, the Census Bureau has 
followed the recommendations of independent and respected scientists and 
government operations specialists in adding new uses of sampling to the census 
design. These new counting methods are aimed at containing costs, improving 
timeliness, and reducing the differential undercount. But they do not 
guarantee a successful census on their own, just as other elements of the 
census plan cannot achieve all of these important goals on their own. 

As many outside experts, including the National Academy of Sciences and the 
General Accounting Office, have suggested, the Census Bureau must improve its 
address list development effort to ensure a more thorough foundation for a 
mail-based census. Congress passed legislation early in the decade to 
facilitate wider access to address lists from other Federal and local 
government agencies. Initial hopes that Postal Service and local government 
address lists would contribute substantially to an improved address list have 
been dampened somewhat by reality, but the Bureau must move forward quickly to 
complete address list development using the most reliable methods possible. 

Redesigned questionnaires that are easy to understand and fill out also may 
encourage more people to respond. However, even simpler forms may not be 
enough to overcome barriers to response such as illiteracy or language. More 
pervasive advertising also is important, as evaluations of previous censuses 
showed that people who are aware of the census are more likely to respond. 

Hiring enumerators indigenous to the neighborhoods they are canvassing is 
necessary to build some level of trust and encourage response. 

Most importantly, perhaps, leaders at all levels of government and in the 
private sector must make an extraordinary effort to build confidence in the 
census process among their constituencies. From Members of Congress and state 
and local elected officials, to religious, civic, business and labor, and 
neighborhood leaders, everyone in a position of influence has an obligation to 
talk about the census in positive terms. If leaders in influential positions 
by virtue of their access to the media continue to question the integrity of 
the process and, by implication, those who carry it out, we risk a failed 
census in every community across the country, not just those that 
traditionally are harder to count. 





Mr. Miller. Any Member that wishes to submit additional ques- 
tions, if you all don’t mind responding, we’d appreciate it. 

Mr. Petri. We’ll try. [Laughter.] 

Mrs. Maloney. And also, on behalf of my colleague, Mr. Sawyer, 
who wished to put forth additional information, may I request for 
Mr. Petri and Mr. Sawyer if they have additional information for 
the record, that it be made part of the record? 

Mr. Miller. Without objection. 

Mr. Petri. Thank you. 

Mrs. Maloney. OK, thank you. 

Mr. Miller. And I will also give you the chance to polish this 
if you want before it goes into the official record. [Laughter.] 

Mrs. Maloney. Thank you both for your work and your testi- 
mony today. 

Mr. Miller. Thank you very much, and I’m sure we’ll be work- 
ing with you a lot more over the next months. 

We will move right on into our next panel of witnesses. If they 
would come forward and gather and have a seat, please. 

If you’ll stand, we have to swear you in to the committee, if you 
would. Just raise your right hands. 

[Witnesses sworn.] 

Mr. Miller. Thank you, please be seated. Thank you very much 
for being with us here today. What I would like to do is have each 
of you make an opening statement; your official statement will go 
in the record, of course. And when you start, if you would just in- 
troduce yourself as to your background and why you’ve been asked 
to appear here today, it would be appreciated. 

And we’d like to start with — Dr. Stark — first, please. Dr. Stark. 

STATEMENTS OF PHILIP STARK, PROFESSOR OF STATISTICS, 
UNIVERSITY OF CALIFORNIA, BERKELEY; KENNETH DARGA, 
PH.D., DEMOGRAPHER, DEPARTMENT OF MANAGEMENT 
AND BUDGET, STATE OF MICHIGAN; AND JERRY COFFEY, 
PH.D., MATHEMATICAL STATISTICIAN 

Mr. Stark. Thank you, Chairman Miller. My name is Philip 
Stark. I’m a professor of statistics at the University of California, 
in Berkeley, where I’ve been on the faculty for about 10 years. I 
have particular interest in problems that involve very large data 
sets with complex data acquisition and in which one’s trying to es- 
timate a lot of unknown quantities and, also, large computational 
problems. 

Mr. Miller. If you’d like to, I think we’ll just — you go ahead and 
make your statement, and then we’ll proceed next with Mr. Darga 
and then with Mr. Coffey. 

Mr. Stark. Thank you. Thank you very much, Chairman Miller, 
and other members of the committee for inviting me to speak about 
the census. 

We know from experience that, overall, the census misses some 
people. The undercount is different in different places, which leads 
to errors in State population shares. As we’ve already heard 
today, for many purposes, including distributing Federal funds and 
congressional representation, State shares matter more than the 
total U.S. population. For that reason, I’ll focus on the accuracy of 
State shares. 





It would be wonderful to know how many people the census 
missed, and where. Then we could add them where they belong. 
That would adjust for the undercount and improve State shares, 
but we don’t know. The missing people weren’t counted. 

Sampling is selecting part of a population to represent the whole. 
The Census Bureau used sampling to estimate the 1990 decennial 
census undercount so they could adjust for it. The official 1990 cen- 
sus numbers were not adjusted. For the 2000 decennial census, 
there are two proposals — the proposal involves using sampling in 
two ways. First of all, to adjust for the undercount, and second, to 
follow up some people who don’t mail back their census forms. I’m 
only going to talk about using sampling to adjust for undercount- 
ing. 

The 1990 and 2000 adjustments have different names and dif- 
ferent details, but they’re based on the same statistical methods, 
so, much of what I say about the 1990 adjustment applies to 2000 
as well. Would the adjustments have improved the 1990 census? 
Probably not, because of statistical bias. Adjustment has two kinds 
of error: sampling error, which comes from the luck of the draw, 
the blocks that happen to be in the sample; and systematic error, 
or bias, which comes from bad data, processing errors, and wrong 
assumptions, among other things. Bias is a technical term; it 
doesn’t mean someone is intentionally skewing the results. Sam- 
pling errors tend to average out; bias does not. Making estimates 
from a sample is like shooting a rifle. Each shot hits the target in 
a different place. Sampling error is the scatter in the shots; bias 
is a tendency for all the shots to be off in the same direction, for 
example, to the left. You fix bias in a rifle by sighting it in. That’s 
straightforward because you can see where the shots land. Fixing 
statistical bias in a census adjustment is hard. You only get one 
shot because you only take one sample, and you can’t see where the 
shot lands because you don’t know the true undercount. 

The 1990 adjustment process was extremely complex, so it’s very 
hard to track down all its biases. For example, months after cal- 
culating the adjustment, the Census Bureau found that a coding 
error had inflated the undercount estimate by a million people, 
about 20 percent of the adjustment; that’s bias. Studies show that 
40 percent to more than 80 percent of the 1990 adjustment is bias. 

Adjustment could easily make the census worse instead of better. 
New York, Pennsylvania, and Illinois lose shares in the adjust- 
ment. Texas and Arizona gain shares. Arguably, it’s easier to count 
people in Dallas and Phoenix, for example, than in the Bronx, 
Philadelphia, and Chicago, where the inner cities are denser. Tak- 
ing shares away from New York, Pennsylvania, and Illinois might 
be right, or it might be bias from bad assumptions. 

Some claim that adjustment would have made the 1990 census 
more accurate. Their technical arguments depend on statistical 
models. The models are false, and they have bizarre consequences. 
For example, the model for correlation bias says that the 1990 cen- 
sus missed nearly 900,000 white males, of whom only 13, less than 
0.002 percent, were between 20 and 30 years old. It also says that 
the 1990 census missed over three-quarters of a million black 
males but counted almost 30,000 too many black males under age 
10. Using that incredible model, the Census Bureau estimated that 
about 38 percent of the adjustment is statistical bias. Without the 
model, the figure is 57 percent, almost 20 percent higher. With bet- 
ter assumptions, the estimated bias is even higher. The study I 
trust most puts the bias over 80 percent. Adjustment puts in far 
more error than it takes out. 

There’s another way to estimate the total population called de- 
mographic analysis. The 1990 adjustment adds more people than 
demographic analysis says were missed, including about a million 
extra women. Because of bias, the adjustment probably puts the 
people in the wrong place, making State shares worse. 

In summary, adjusting the census using sampling did not work 
in 1990 because of statistical bias. Taking a bigger sample, as pro- 
posed for the 2000 census, could make bias even worse. 

Thank you. 

[The prepared statement of Mr. Stark follows:] 





Sampling to Adjust the 1990 Census for Undercount 

Prepared for 5 May 1998 hearing of the 

United States of America 
House of Representatives 
Subcommittee on the Census 


Philip B. Stark 
Department of Statistics 
University of California 
Berkeley, CA 94720-3860 


I thank Chairman Miller and the other members of the Subcommittee for inviting me to speak about the 1990 census 
adjustment. 

We know from experience that overall, the census misses some people. 1 The undercount is different in different 
places. 2 That leads to errors in state population shares. For many purposes, including distributing Federal funds 
and congressional representation, state shares matter more than the total U.S. population. 3 I will focus on the 
accuracy of state shares. 

It would be wonderful to know how many people the census missed, and where. Then we could add them where 
they belong. That would adjust for the undercount, and improve state shares. But we do not know: the missing 
people were not counted. 

Sampling is selecting part of a population to represent the whole. The Census Bureau used sampling to estimate the 
1990 Decennial Census undercount, so they could adjust for it. The official 1990 census numbers were not 
adjusted. For the 2000 Decennial Census, there is a proposal to use sampling to adjust for undercount, and to use 
sampling to follow up some people who do not mail back their census forms. I will talk only about using sampling 
to adjust for undercount. 

The 1990 and 2000 adjustments have different names and different details, 4 but they are based on the same 
statistical methods, so much of what I say about the 1990 adjustment applies to 2000 as well. 

Would the adjustments have improved the 1990 census? Probably not, because of statistical bias. 

Adjustment has two kinds of error: sampling error, which comes from the luck of the draw (the blocks that 
happen to be in the sample), and systematic error or bias, which comes from bad data, processing errors, and 
wrong assumptions, among other things. 

Bias is a technical term: it does not mean someone is intentionally skewing the results. Sampling errors tend to 
average out. Bias does not. 

Making estimates from a sample is like shooting a rifle. Each shot hits the target in a different place. Sampling error 
is the scatter in the shots. Bias is a tendency for all the shots to be off in the same direction, for example, to the left. 
You fix bias in a rifle by sighting it in. That is straightforward, because you can see where the shots land. 

Fixing statistical bias in a census adjustment is hard. You only get one shot (because you only take one sample), 
and you cannot see where the shot lands (because you do not know the true undercount). The 1990 adjustment 
process was extremely complex, 5 so it is very hard to track down all its biases. 6 
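
The distinction between sampling error and bias in the rifle analogy can be illustrated with a small simulation. The sketch below is not part of the analysis behind this statement: the true undercount, the bias, and the scatter are hypothetical values chosen only for illustration, and the function name `simulate_estimates` is likewise an assumption.

```python
import random
import statistics

def simulate_estimates(true_value, bias, noise_sd, n_samples, seed=0):
    """Return n_samples estimates, each equal to truth + bias + random scatter."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n_samples)]

# Hypothetical values (in millions of people), NOT figures from the 1990 census.
TRUE_UNDERCOUNT = 4.0
BIAS = 1.0        # systematic error: every "shot" is pulled the same way
NOISE_SD = 0.5    # sampling error: scatter from the luck of the draw

estimates = simulate_estimates(TRUE_UNDERCOUNT, BIAS, NOISE_SD, n_samples=10_000)

# Averaging many samples cancels the scatter but leaves the bias untouched.
mean_error = statistics.mean(estimates) - TRUE_UNDERCOUNT
scatter = statistics.stdev(estimates)

# A real adjustment takes only one sample, so its error mixes bias and scatter,
# and without knowing TRUE_UNDERCOUNT the two cannot be told apart.
single_shot_error = estimates[0] - TRUE_UNDERCOUNT

print(f"mean error over many samples: {mean_error:.2f}")
print(f"scatter (standard deviation): {scatter:.2f}")
print(f"error of one sample:          {single_shot_error:.2f}")
```

In this toy setting the average error over many samples stays close to the assumed bias, while a larger sample would shrink only the scatter. That is the sense in which sampling errors average out and bias does not.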




For example, months after calculating the adjustment, the Census Bureau found that a coding error had inflated the 
undercount estimate by 1,000,000 people 7 — about 20% of the adjustment. That’s bias. Studies show that 40% to 
more than 80% of the 1990 adjustment is bias. 1 Adjustment could easily make the census worse instead of better. 

New York, Pennsylvania, and Illinois lose shares in the adjustment 9 Texas and Arizona gain shares. Arguably, it is 
easier to count people in Dallas and Phoenix, for example, than in the Bronx, Philadelphia, and Chicago, where die 
inner cities are denser. 10 Taking shares from New York, Pennsylvania and Illinois might be right — or it might be 
bias from bad assumptions. 

Some claim that adjustment would have made the 1990 census more accurate. 11 Their technical arguments depend
on statistical models. 12 The models are false, 13 and have bizarre consequences. For example, the model for
“correlation bias” says that the 1990 census missed nearly 900,000 white males, of whom only 13 (less than
0.002%) were between 20 and 30 years old. It also says that the 1990 census missed over three quarters of a
million black males, but counted almost 30,000 too many black males under age 10. 14

Using that incredible model, the Census Bureau estimated that about 38% of the adjustment is statistical bias.
Without the model, the figure is 57%, almost 20% higher. 15 With better assumptions, the estimated bias is even
higher. The study I trust most 16 puts the bias over 80%: adjustment puts in far more error than it takes out.

There is another way to estimate the total population, called Demographic Analysis. 17 The 1990 adjustment adds
more people than Demographic Analysis says were missed, including about a million extra women. 18 Because of
bias, the adjustment probably puts the people in the wrong place, making state shares worse. 19

In summary, adjusting the census using sampling did not work in 1990, 20 because of statistical bias. Taking a bigger 
sample, as proposed for the 2000 census, could make bias even worse. 


Technical Notes 

1 Edmonston, B. and Schultze, C., eds., 1995. Modernizing the U.S. Census, National Academy Press,
Washington, D.C.

2 Ibid. 

3 Ibid. 

4 The 1990 procedure is called the Dual-System Estimator (DSE), which uses data from the Post-Enumeration 
Survey (PES). The 2000 procedure is called Integrated Coverage Measurement (ICM).

Both PES/DSE and ICM take a random sample of blocks after the census is taken, and tabulate the people found in
the households in those blocks who were missed by the census (omissions), as well as the people in the census who
should not have been counted in those blocks (erroneous enumerations). Results are pooled for the blocks in the
sample to get the fractions missed and erroneously enumerated, for various groups of people, called “post-strata.”
For example, black male renters age 30-44 living in the central city of a major metropolitan area in New England
comprised one 1990 PES post-stratum. There were 1,392 PES post-strata in all.

The basic idea in the adjustment is that the fraction of people in a post-stratum who were in the sample blocks, but
not in the census, is an estimate of the fraction of all the people in the post-stratum that the census missed. The
fraction in the census in a post-stratum in the sample blocks, but not in the PES, is an estimate of the fraction of
people in the post-stratum the census enumerated erroneously. The difference estimates the undercount rate for the
post-stratum. Dividing the census count by (100% - undercount rate) adjusts for the undercount.
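
As a rough illustration of the arithmetic in this note, the following Python sketch computes a net undercount rate and an adjusted count for a single post-stratum. The function name and the sample figures are hypothetical, chosen only to show the calculation; they are not taken from the PES.

```python
def adjust_post_stratum(census_count, omission_rate, erroneous_rate):
    """Return (net undercount rate, adjusted count) for one post-stratum.

    omission_rate:  estimated fraction of people the census missed
    erroneous_rate: estimated fraction the census enumerated erroneously
    Rates are fractions (0.05 means 5%). All inputs here are hypothetical.
    """
    undercount_rate = omission_rate - erroneous_rate
    if undercount_rate >= 1.0:
        raise ValueError("undercount rate must be below 100%")
    # Dividing by (100% - undercount rate) scales the census count.
    adjusted = census_count / (1.0 - undercount_rate)
    return undercount_rate, adjusted


# Hypothetical post-stratum: 10,000 counted, 5% omitted, 3% erroneous.
rate, adjusted = adjust_post_stratum(10_000, 0.05, 0.03)
print(f"net undercount rate: {rate:.1%}, adjusted count: {adjusted:.0f}")
```

Note that any error in either estimated fraction passes straight through to the adjusted count, which is why the matching and missing-data details discussed in this note matter so much.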




This is just a sketch: the details of determining whether or not there is a match, treating missing data, and combining 
numbers from different blocks to estimate fractions in post-strata are extremely complex; see Hogan, H., 1993. The 
1990 Post-Enumeration Survey: Operations and Results, J. Amer. Statist. Assoc., 88, 1047-1060.

5 Hogan, H., loc. cit. 

6 There is a great deal of information in Committee on Adjustment of Postcensal Estimates, 1992. Assessment of
Accuracy of Adjusted Versus Unadjusted 1990 Census Base for Use in Intercensal Estimates, Bureau of the
Census (C.A.P.E. Report). Here are some excerpts:

“...additional research detected some errors and made some refinements to the levels of undercount originally 
reported in the spring of 1991.” C.A.P.E. Report, p2. 

The table on p3 of the C.A.P.E. Report shows that uncertainty estimates (for sampling error alone) were increased by
as much as 300%.

“As a result of an error in computer processing, the estimated national undercount rate of 2.1% was overstated
by 0.4%. After correcting the computer error, the national level of undercount was estimated to be about 1.7%.
After making other refinements and corrections, the national undercount is now estimated to be about 1.6% [the
figure is 1.58% in attachment 3, Table 2] ... The level of total bias, excluding correlation bias, on the revised
estimate of undercount is negative 0.73 (-0.73%).” C.A.P.E. Report, p15.

Thus (2.1 - 1.58 + 0.73)/2.1 ≈ 60% of the original estimate of 2.1% is bias. The report continues, evaluating the
“revised” estimates, which correct the coding error and use different post-strata:

“Therefore, about 45% (0.73/1.58) of the revised undercount is actually measured bias and not measured
undercount. In 7 of the 10 evaluation strata, 50% or more of the estimated undercount is bias.” C.A.P.E.
Report, p15.
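
The two bias shares quoted in this note can be checked with a few lines of Python. This is only a restatement of the arithmetic above using the percentages from the C.A.P.E. excerpts; the variable names are illustrative.

```python
original = 2.10       # original estimated national undercount, percent
revised = 1.58        # revised estimate (attachment 3, Table 2), percent
measured_bias = 0.73  # magnitude of measured bias in the revised estimate

# Share of the original 2.1% that disappears once the corrections
# and the measured bias are both removed.
share_of_original = (original - revised + measured_bias) / original

# Share of the revised 1.58% that is measured bias.
share_of_revised = measured_bias / revised

print(f"{share_of_original:.1%} of the original estimate")
print(f"{share_of_revised:.1%} of the revised estimate")
```

The first ratio is just under 60%. The second works out to roughly 46%, which the report describes as “about 45%.”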

7 C.A.P.E. Report, p15.

8 According to the Director of the Bureau of the Census, 

“A significant amount of bias remains. The research estimates that, at the national level, removing all biases
from the PES estimates would lower the estimated undercount from 1.6 to 1.3 percent. When the effect of
correlation bias is not taken into account ... the estimated undercount would fall to 0.9 percent.”

Bureau of the Census, 1993. Decision of the Director of the Bureau of the Census on Whether to Use 
Information From the 1990 Post-Enumeration Survey (PES) To Adjust the Base for the Intercensal Population
Estimates Produced by the Bureau of the Census ACTION: Notice of final decision. Federal Register 58 FR 69. 
(cited as 58 FR 69 henceforth) 

That yields bias estimates of 38%-57%, depending on the treatment of “correlation bias.” Correlation bias has to do 
with the fact that a key assumption in the capture-recapture model is false: everyone in a post-stratum in the sample 
blocks does not have the same chance of being found by the PES, and the same chance of being found by the 
census. For example, if there are people unreachable by any survey, the PES cannot detect that they are missing
from the census. That would tend to make the undercount estimate smaller than the true undercount. The 
correlation bias estimate uses a model to disaggregate Demographic Analysis figures to local levels. Evidence cited 
below (note 14) shows how unreasonable the model is. 
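
The note does not spell out how the 38% and 57% figures follow from the Director's statement. One reading that reproduces them, offered here as an assumption rather than as the author's stated method, measures each bias-free estimate against the original 2.1% undercount estimate quoted in note 6:

```python
def bias_share(original_estimate, bias_free_estimate):
    """Fraction of the original estimate that is attributed to bias."""
    return (original_estimate - bias_free_estimate) / original_estimate


ORIGINAL = 2.1  # original PES undercount estimate, percent (note 6)

# All biases removed, using the correlation bias model: 1.3 percent.
low = bias_share(ORIGINAL, 1.3)
# Correlation bias not taken into account: 0.9 percent.
high = bias_share(ORIGINAL, 0.9)

print(f"bias share with the correlation bias model:    {low:.0%}")
print(f"bias share without the correlation bias model: {high:.0%}")
```

Measured against the revised 1.6% estimate instead, the same calculation would give smaller shares, so the choice of baseline matters when comparing these percentages with others in the notes.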

The figure of over 80% comes from Breiman, L., 1994. The 1991 Census Adjustment: Undercount or Bad 
Data? Statistical Science, 9, 458-537. Breiman combines information from various Bureau of the Census 
evaluation studies. Sources of bias include fabrications by interviewers, matching errors, census day address errors, 
bias in the ratio estimator, people discovered to be out-of-scope in reinterview, late census data, and the computer
coding error. The following paragraphs are drawn from Breiman’s work.

Small errors in the match rate can produce extremely large errors in the undercount estimates. For example, in one 
block cluster, an unmatched family of 5 people added 45,000 to the undercount estimate. In Census Bureau studies 
of matching errors, match and rematch classifications disagree by 1.8%. A June, 1991, Census Bureau 
memorandum states: “...approximately 75 percent of the non-matching people could have been converted to a match 
if the search area had been expanded.” This is a huge source of bias.




The match status of about 2% of the cases could not be resolved from the records or by interview. Depending on 
how these cases are treated, the PES estimates range from an overcount of 1,000,000 people to an undercount of 
9,000,000 people. See also Wachter, K.W., 1991. Recommendations on 1990 Census Adjustment, report to the 
Secretary of Commerce as a Member of Special Advisory Panel, U.S. Department of Commerce, for “half-high”
and “half-low” estimates.

The “probabilities” that unresolved cases were matches were imputed using a statistical model [Belin, T.R., et al.,
1993. Hierarchical Logistic Regression Models for Imputation of Unresolved Enumeration Status in
Undercount Estimation, J. Amer. Statist. Assoc., 88, 1149-1159] with obviously false assumptions [Wachter,
K.W., 1993. Comment: Ignoring Nonignorable Effects, J. Amer. Statist. Assoc., 88, 1161-1163]. At least one
explanatory variable in the model is missing for 28% of the unresolved PES cases, and 38% of the unresolved
census-sample cases; those missing variables were also imputed.

9 See Figure 1, p111, in Wachter, K.W., 1993. The Census Adjustment Trial: An Exchange, Jurimetrics, 34,
107-115. California would have gained most by adjustment; Texas, second most. Pennsylvania would have lost
most; Ohio, second most.

10 Ibid.

11 For example, National Academy of Sciences reports recommend using sampling-based adjustments for the 2000
Decennial Census; see Edmonston, B., and Schultze, C., eds., 1995. Modernizing the U.S. Census, National
Academy Press, Washington, D.C. 460pp., and White, A.A., and Rust, eds., 1997. Preparing for the 2000
Census, National Academy Press, Washington, D.C. 98pp.

I reviewed those reports. Their evidence is weak. The issue is whether the PES improves or degrades the accuracy 
of the census. That is very hard to determine, because the PES is subject to large biases that cannot be measured 
directly. Arguments in favor of the PES depend on the assumption that the errors in the PES generally go in the 
same direction as the true undercount, and seldom go too far. 

12 See, for example, Mulry, M.H., and Spencer, B.D., 1993. Accuracy of the 1990 Census and Undercount
Adjustments, J. Amer. Statist. Assoc., 88, 1080-1091.

13 In addition to models relating various parameters, the assumptions include: 

Independence: This assumption has two parts. First, for each individual, in the sample blocks, being caught in the 
census is independent of being caught by the PES. Second, the probability of being caught in the census is 
the same for every individual in a given post-stratum within the sample blocks, as is the probability of 
being caught in the PES. 

Synthetic Assumption (Homogeneity): In each block that was not sampled, the nonresponse rate is a weighted
average of the nonresponse rates of the post-strata that intersect the block. The weights are the proportions
of people in the block in the post-strata.
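
To see why the independence assumption matters, consider a minimal Python sketch of a capture-recapture (dual-system) estimate. The post-stratum below is hypothetical: it mixes an easy-to-count group and a hard-to-count group, so the second part of the assumption (equal capture probabilities within the post-stratum) fails. Expected counts are used in place of random draws to keep the example deterministic.

```python
def dual_system_estimate(in_census, in_pes, in_both):
    """Capture-recapture estimate of the total population."""
    return in_census * in_pes / in_both


# Hypothetical sub-groups: (true size, P(in census), P(in PES)).
groups = [
    (9_000, 0.98, 0.95),  # easy to count
    (1_000, 0.60, 0.50),  # hard to count
]

# Expected counts, with census and PES capture independent
# for each individual but unequal across the two sub-groups.
in_census = sum(n * pc for n, pc, pp in groups)
in_pes = sum(n * pp for n, pc, pp in groups)
in_both = sum(n * pc * pp for n, pc, pp in groups)

true_total = sum(n for n, _, _ in groups)
estimate = dual_system_estimate(in_census, in_pes, in_both)

print(f"true population:      {true_total}")
print(f"census count:         {in_census:.0f}")
print(f"dual-system estimate: {estimate:.0f}")
```

In this sketch the estimate lands above the census count but below the true population, which is the direction of error the notes call correlation bias. With a single homogeneous group the same formula recovers the true total exactly.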

Violation of the independence assumption leads to “correlation bias”; see note 14. There are a number of studies of
the synthetic assumption using proxy variables, for example, Hengartner, N., and Speed, 1993. Assessing
Between-Block Heterogeneity Within Post-Strata of the 1990 Post-Enumeration Survey, J. Amer. Statist.
Assoc., 88, 1119-1129, and Freedman, D. and Wachter, K., 1994. Heterogeneity and Census Adjustment for
the Intercensal Base, Statistical Science, 9, 476-485.
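
The synthetic assumption can be written out as a short calculation. The Python sketch below applies post-stratum rates to one unsampled block, weighting by the share of the block's people in each post-stratum. The post-stratum labels, counts, and rates are hypothetical.

```python
def synthetic_block_rate(block_counts, post_stratum_rates):
    """Weighted average of post-stratum rates for one block.

    block_counts:       people in the block, by post-stratum
    post_stratum_rates: estimated rate for each post-stratum (fractions)
    """
    total = sum(block_counts.values())
    return sum(
        (count / total) * post_stratum_rates[stratum]
        for stratum, count in block_counts.items()
    )


# Hypothetical block with people in three post-strata.
block_counts = {"A": 60, "B": 30, "C": 10}
rates = {"A": 0.01, "B": 0.04, "C": 0.10}

block_rate = synthetic_block_rate(block_counts, rates)
print(f"synthetic rate for the block: {block_rate:.1%}")
```

The calculation gives every block with the same mix of post-strata the same rate, regardless of where it is. That is the homogeneity the studies cited in this note put to the test.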

Those studies find that heterogeneity within post-strata is significant. According to the Director of the Bureau of the 
Census, 

“...it is possible that errors due to heterogeneity in fact could be larger than all other sources of error in the
adjustment.” 58 FR 69.

The C.A.P.E. also studied heterogeneity:

“The Panel cautioned that artificial population analysis ... was inconclusive about whether the homogeneity 
assumption held.” C.A.P.E. Report, p30.

But their analysis had flaws: 




“A first analysis showed similar homogeneity for the 1,392 design as well as the 357 design as well as for a 
design with only 2 strata.” C.A.P.E. Report, p26.

They also state: 

“The level of bias in the PES was close to the point where artificial population analysis shows that homogeneity 
assumption fails to hold.” C.A.P.E. Report, p26.

14 The model for disaggregating “correlation bias” from national DA estimates down to local levels is in Bell,
W.R., 1993. Using Information from Demographic Analysis in Post-Enumeration Survey Estimation, J. Amer.
Statist. Assoc., 88, 1106-1118. The consequences of that model for the cited demographic groups are on p533 of
Freedman, D., and Wachter, K., 1994. Rejoinder, Statistical Science, 9, 527-537.

The C.A.P.E. also had reservations about the model of correlation bias: 

“The fourth cell in the DSE is an estimate of the number of people missed in both the PES and the census... Both
the Committee and the Panel of Experts were very concerned about the negative values in the fourth
cell... correlation bias should be a component of total error. However, there was concern about our method of
estimating it and very serious concern about the method of allocating it.” C.A.P.E. Report, pp22-23.

“The Census Bureau ... knew of no adequate methodology to remove the bias by state, city, etc.” C.A.P.E.
Report, p30.

15 See note 8.

16 Breiman, L., loc. cit.

17 Robinson, J.G., Ahmed, Das Gupta, P., and Woodrow, K.A., 1993. Estimation of Population Coverage 
in the 1990 United States Census based on Demographic Analysis, J. Amer. Statist. Assoc., 88, 1061-1079.

18 “...there was concern that the PES estimated a higher population than DA and estimated about a million more
women than DA.” C.A.P.E. Report, p27.

19 According to the Director of the Bureau of the Census,

“...no survey - either the high quality, well controlled and interviewed PES of 170,000 households or a larger
one - can be used to make post-census fine tuning of an average undercount as small as 1.6 percent in all types
of places, counties, and states at a level of accuracy beyond that by which surveys are usually judged.... there is
little or no evidence adjustment would improve the quality of substate estimates...” 58 FR 69.

20 “...there is no intention to adjust the 1990 census because research shows insufficient technical justification.” 
C.A.P.E. Report, p33.




Mr. Miller. Mr. Darga. 

Mr. Darga. Thank you. My name is Kenneth Darga, and I am
a demographer working for the State of Michigan in the Department
of Management and Budget. We routinely provide input to
the Census Bureau through various Federal-State cooperative programs
involving population estimates, population projections, and
the State Data Center program. My first involvement with population
undercount adjustment was in response to the Census Bureau’s
invitation for States to provide input to their decision on
whether or not to adjust the population estimates base for
undercount in the 1990 census.

I would like to thank Chairman Miller and all the members of
the Subcommittee on the Census for inviting me to speak with you
today about census undercount adjustment. At this time, I would
like to submit two papers for the record which I will then summarize
briefly.

Mr. Miller. Without objection, thank you. 

[The information referred to follows:] 





Two Papers on Census Undercount Adjustment: 

• Straining Out Gnats and Swallowing Camels: 

The Perils of Adjusting for Census Undercount 

• Quantifying Measurement Error and Bias in the 
1990 Undercount Estimates 


Kenneth Darga 

Office of the State Demographer 

Michigan Information Center 

Michigan Department of Management and Budget 


April 29, 1998 






EXECUTIVE SUMMARY

“Straining Out Gnats and Swallowing Camels: The Perils of Adjusting for Census Undercount”
“Quantifying Measurement Error and Bias in the 1990 Undercount Estimates”

April 29, 1998 


There is reason to believe that the proposed remedy for Census undercount would be far 
worse than the undercount problem itself. 

The proposed method of adjusting for undercount involves conducting a sample survey to 
identify people who were missed by the Census and people who were counted twice or 
counted in the wrong location. In order to succeed, this survey has to secure participation by 
the people who were missed by the Census, and it has to be very accurate in matching 
individuals counted by the sample survey with individuals counted by the Census. 
Unfortunately, these are impossible tasks: there are too many people who do not want the 
government to know where they are, and there are too many obstacles to matching the results 
of the two surveys successfully. 

The undercount adjustments that were developed by this method for the 1990 Census seemed 
plausible at first glance, but they were strongly affected by several types of error in 
classifying people as missed or not missed by the Census. The first paper (“Straining Out 
Gnats and Swallowing Camels: The Perils of Adjusting for Census Undercount”) shows that 
the proposed approach makes high levels of error inevitable and that the resulting 
adjustments have indeed been seriously flawed. The second paper (“Quantifying 
Measurement Error and Bias in the 1990 Undercount Estimates”) identifies and quantifies 
several specific types of error: 

• survey matching error 

• fabrication of interviews 

• ambiguity or misreporting of usual residence 

• geocoding errors 

• unreliable interviews 

• unresolvable cases. 

Together, these papers show that many of the people who were missed by the Census were 
missed by the coverage survey as well, and that many of the people who were identified as 
missed by the Census actually do not seem to have been missed at all. 

Thus, in addition to reflecting differences in actual undercount rates, the adjustments derived 
from the sample survey reflect differences in the rate of error in classifying people as 
undercounted. Applying such adjustment factors to the Census would decrease the accuracy 
of local population counts and of the many detailed tabulations that are relied upon by all 
levels of government and by myriad private users of demographic data. These errors would 
usually be small, but they would sometimes be errors of 10%, 20%, or more. Since no one 
would know which areas and which population groups had serious errors, and since the 
errors would not be consistent from one Census to the next, all findings based on Census data 
and all comparisons between different time periods would come into question. In an attempt 
to address an inaccuracy at the national level, we would utterly destroy the reliability of 
Census data at the state and local level. 






Straining Out Gnats and Swallowing Camels: 
The Perils of Adjusting for Census Undercount 


Kenneth Darga 

Office of the State Demographer 

Michigan Information Center 

Michigan Department of Management and Budget 


April 29, 1998 







Although the Department of Commerce is often criticized for Census undercount, 
it is not surprising that every Census misses a portion of the population. In fact, 
what is noteworthy is not that the undercount persists, but rather that the net 
undercount appears to have been less than 5 million people in 1990, or only about 
1.8% of the population. 1 

A major reason for the undercount — although not by any means the only reason — 
is that quite a few people do not want their identities known by the government. 
For example, the United States has over 1 million people who do not make any of 
their required payments on court ordered child support 2 and an estimated 5 
million illegal immigrants. 3 Each year, the police make over 14 million arrests 
for non-traffic offenses. 4 Millions of additional criminals remain at-large, many 
people would lose government benefits if the actual composition of their 
households were known, and many people have other reasons for concealing their 
identity and whereabouts from the government. If the Census misses fewer than 5 
million people under these circumstances, then the Census Bureau is doing a truly 
remarkable job. 

Nevertheless, eliminating even this small error would be a valuable achievement. 
Although the impact on many components of the population would be small, 
people in some demographic and economic categories are undercounted more than 
others. This leads to anomalies and imprecision in some analyses and affects 
political apportionment and fund distribution. The Census Bureau has therefore 
tried very hard to devise ways to measure and compensate for the problem of 
undercount. 

Obviously, these methods are intended to make the Census count better. However, 
we need to evaluate their actual effects instead of their intended effects. Before we 
decide to use these particular methods in the official population count for the year 
2000, we need to determine whether they would make that population count better 
or worse. 


1 U.S. Department of Commerce, "Census Bureau Releases Refined 1990 Census Coverage Estimates from
Demographic Analysis," Press Release of June 13, 1991, Table 1. 

2 Economics and Statistics Administration, U.S. Department of Commerce, "Statistical Brief: Who Receives Child 
Support?," May 1995. 

3 U.S. Immigration and Naturalization Service, "INS Releases Updated Estimates of U.S. Illegal Immigration," 
Press Release of February 2, 1997. 

4 U.S. Department of Justice, Bureau of Justice Statistics, Sourcebook of Criminal Justice Statistics, 1995, p. 
394. 




After reviewing some of the reasons for believing that censuses miss a portion of 
the population, this paper briefly describes the Census Bureau’s proposed method 
of adjusting for undercount. It will then be shown that, although the results of this 
method for 1990 appeared plausible, at least at the broadest national aggregation, 
the method cannot produce reliable adjustments for undercount: It is not capable 
of counting many of the people who are missed by the Census, it is very sensitive 
even to extremely small sources of error, and it is subject to many sources of 
error that are very serious. Thus, it is not surprising to find that many of the 
detailed undercount measurements for 1990 were implausible and, in some cases, 
demonstrably false. In an effort to correct a net national undercount of less than 
2%, spurious undercounts of 10%, 20%, and even 30% were identified for some 
segments of the population. Adjustments derived from these measurements would 
have had a devastating impact on the usefulness and accuracy of Census data at the 
state and local level, and they would have had an adverse effect upon nearly all 
purposes for which Census data are used. Similar problems can be expected with 
the undercount adjustment proposed for Census 2000: The problems are not due 
to minor flaws in methodology or implementation, but rather to the impossibility 
of measuring undercount through the sort of coverage survey that has been 
proposed. 

The Evidence of Undercount. Before examining the Census Bureau's method 
of adjusting for undercount, it is instructive to consider how we can know that 
each Census misses part of the population. 

One way to find evidence of undercount is to project the population for a Census 
year by applying mortality rates and migration rates to the results of other 
censuses. The pattern of differences between these projections and the actual 
Census counts can provide good evidence for undercount. For example, if the 
count of black males age 20 to 24 is lower than would be expected based on the 
number of black males age 10 to 14 in the previous Census, and if it is lower than 
would be expected based on the number of black males age 30 to 34 in the 
following Census, then there is good evidence of undercount for that segment of 
the population. 
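
The cohort comparison described above can be sketched in a few lines of Python. All figures below are hypothetical and serve only to show the mechanics: a cohort from the previous Census is carried forward ten years using an assumed survival rate and assumed net migration, and the result is compared with the new count.

```python
def expected_cohort(previous_count, survival_rate, net_migration):
    """Project a cohort forward from one census to the next."""
    return previous_count * survival_rate + net_migration


# Hypothetical cohort: age 10 to 14 in the previous Census,
# age 20 to 24 in the current one.
previous_count = 1_000_000
survival_rate = 0.99       # assumed share surviving ten years
net_migration = 20_000     # assumed net arrivals over ten years
current_count = 930_000    # hypothetical count in the current Census

expected = expected_cohort(previous_count, survival_rate, net_migration)
shortfall = (expected - current_count) / expected

print(f"expected: {expected:,.0f}, counted: {current_count:,}")
print(f"apparent shortfall: {shortfall:.1%}")
```

A shortfall of this kind is only suggestive on its own, since it also reflects any coverage error in the earlier Census and any error in the assumed rates. That is why the text asks for agreement with the following Census as well.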

The most widely accepted method for measuring Census undercount is called 
“demographic analysis.” Using a combination of birth registration data, estimates 
of under-registration, mortality rates, estimates of international migration, social 
security enrollment data, and analyses of previous censuses, the Census Bureau 
develops estimates of the national population for each Census year by age, race, 
and sex. Although they are not perfect, the gap between these estimates and the
national Census count provides the best available measure of undercount. The
pattern of undercount suggested by demographic analysis is generally consistent 
from one Census to another, and it is consistent with the discrepancies that are 
found between population projections and Census counts: Undercount rates appear 
to be higher for males than for females, higher for blacks than for whites, and 
higher for young adults than for people in other age groups. 5 

Demographic analysis suggests that the net national undercount fell in each 
successive Census from 5.4% of the population in 1940 to only 1.2% in 1980. 
This reflects improvements in Census-taking methodologies, special efforts 
focused on segments of the population that are hard to count, and assurances that 
Census information will be kept strictly confidential. However, the estimated net 
undercount rose to 1.8% in the 1990 Census: still quite low by historic standards, 
but disappointing because it represents an increase relative to the previous Census. 
(See Figure 1.) 

A major shortcoming of this method is that it works only at the national level: 
There is too much interstate and intrastate migration to allow a phenomenon as 
subtle as Census undercount to be visible at the state or local level through 
demographic analysis. Since we can expect undercount to vary considerably from 
state to state and neighborhood to neighborhood, we cannot simply apply the 
national undercount rates to state and local population counts. This would not 
adjust some areas enough, and it would introduce inaccuracies into areas where 
there had not been inaccuracies before. 

Figure 1

Estimates of Census Undercount
Based on Demographic Analysis 6

Population Category    1940    1950    1960    1970    1980    1990

Total Population       5.4%    4.1%    3.1%    2.7%    1.2%    1.8%

Black                  8.4%    7.5%    6.6%    6.5%    4.5%    5.7%
Non-Black              5.0%    3.8%    2.7%    2.2%    0.8%    1.3%

[Chart: Undercount Rate for Total Population, 1940 to 1990; vertical axis 0% to 6%]



5 J. Gregory Robinson et al., “Estimation of Population Coverage in the 1990 United States Census Based on
Demographic Analysis,” Journal of the American Statistical Association, 88(423): 1061-1079, 1993.

6 Ibid., p. 1065. 




Calculating Adjustments for Undercount. The Census Bureau has therefore 
tried to develop additional methods to estimate how well the Census covers each 
segment of the population. Immediately after the Census count is complete, the 
Bureau conducts a “coverage survey” which essentially repeats the population 
count for a small sample of census blocks. The coverage survey was called the 
“PES” or “Post-Enumeration Survey” in 1990, and it will be called “ICM” or the 
“Integrated Coverage Measurement Survey” in 2000. Data from the coverage 
survey are matched person-by-person with the original Census to identify the 
individuals counted by the coverage survey who seem to have been missed by the 
Census. These results are tabulated by relevant population characteristics to 
produce estimated undercount rates which can be applied to local areas based on 
their counts of persons with those characteristics. A sample of original Census
forms is also matched with the coverage survey to identify individuals who were
counted by the Census but omitted by the survey. These discrepancies are 
investigated and used to estimate “erroneous enumerations” or overcount. 
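
The person-by-person comparison described above can be pictured with a deliberately naive Python sketch. It matches records on an exact, normalized name and address, which is an assumption made purely for illustration and not a description of the Bureau's matching procedure. All records are invented.

```python
def match_records(census, survey):
    """Compare (name, address) records from the same sample blocks."""

    def key(record):
        name, address = record
        # Normalize case and spacing; nothing cleverer than that.
        return (" ".join(name.lower().split()),
                " ".join(address.lower().split()))

    census_keys = {key(r) for r in census}
    survey_keys = {key(r) for r in survey}
    return {
        "matched": census_keys & survey_keys,
        "possibly_missed_by_census": survey_keys - census_keys,
        "possible_erroneous_enumerations": census_keys - survey_keys,
    }


# Invented records for one block.
census = [("Ann Lee", "12 Oak St"), ("Bob Ray", "14 Oak St"),
          ("Cy Diaz", "16 Oak St")]
survey = [("ann  lee", "12 Oak St"), ("Robert Ray", "14 Oak St"),
          ("Dee Fox", "18 Oak St")]

result = match_records(census, survey)
for label, keys in result.items():
    print(label, sorted(keys))
```

In this toy data, “Robert Ray” and “Bob Ray” are presumably the same person, yet the exact-key comparison reports him both as missed by the Census and as a possible erroneous enumeration. Any matching rule has some version of this problem.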

Plausibility of the Adjustments. The resulting adjustment to the 1990 Census 
was quite plausible at the broadest national level. After moving up and down as 
corrections were made to the data and new statistical techniques were applied, the 
estimate of overall net undercount at the national level was 1.6% 7 (very close to
the 1.8% suggested by demographic analysis). The credibility of the 1990 coverage
survey was increased by the fact that it suggested high rates of undercount at the 
national level for the groups that would be expected to have high undercounts, 
such as Hispanics, blacks, people with difficulty speaking English, people in 
complex households, and people living in non-standard housing units. 8 Thus, 
one is tempted to conclude that the data from a coverage survey can provide an 
incredibly accurate measure of Census undercount. 

Implausibility of the Adjustments. Before drawing that conclusion, 
however, we must consider a much less incredible interpretation: The differences 
between the coverage survey and the original Census may not represent net 
undercount as much as they represent the difficulty of matching individual records 
between two surveys. At a very broad level of aggregation, this methodological 
difficulty can produce results that look very much like net undercount because the 
population groups which are hard to match between surveys are generally the
same groups that are hard to count. It is only by considering the tremendous
barriers to measuring undercount accurately and by examining the detailed
findings of the 1990 PES that we are led to accept this alternate interpretation. If
this interpretation is correct, it has very clear implications for how the next
Census should be conducted: Adjusting the new Census based on a coverage
survey would negate the findings from 100 million Census forms based on a
statistical artifact.

7 Howard Hogan, "The 1990 Post-Enumeration Survey: Operations and Results," Journal of the American
Statistical Association, 88(423): 1047-1060, 1993.

8 Manuel de la Puente, U.S. Bureau of the Census, "Why Are People Missed or Erroneously Included by the
Census: A Summary of Findings From Ethnographic Coverage Reports," report prepared for the Advisory
Committee for the Design of the Year 2000 Census Meeting, March 5, 1993. J. Gregory Robinson and Edward
L. Kobilarcik, U.S. Bureau of the Census, "Identifying Differential Undercounts at Local Geographic Levels: A
Targeting Database Approach," paper presented at the Annual Meeting of the Population Association of America,
April 1995.

For a coverage survey to measure net undercount with anything approaching an 
acceptable level of accuracy, it must accomplish two impossible tasks. The 
impossibility of these tasks should lead us to question its validity even if it appears 
on the surface to provide a good measure of undercount. In particular, we should 
not conclude that the Census Bureau has accomplished the impossible merely on 
the basis of plausible results for the broadest national aggregation. If the detailed 
results do not make sense as well, then it is untenable to suggest that undercount 
has been measured with a high level of precision. 

The first impossible task that a coverage survey must accomplish is to secure 
participation by two particularly problematic components of the population that 
are not counted well by the Census: homeless people and people who do not want 
to be counted. Each Census includes a major effort to count people in shelters and 
on the streets, but it undoubtedly misses a large portion of this population. A 
coverage survey is not well equipped to measure this component of the undercount 
because many homeless people are not likely to be found in the same place a few 
weeks or months later when the survey is conducted. The Census Bureau 
understands the impossibility of this task, and the 1990 PES therefore did not 
even attempt to address this portion of the undercount. 9 A coverage survey
does not fare much better with the other problematic component of the
population. It is hard to imagine that very many of the people who avoided being 
counted by the Census are likely to be counted by a second survey that has 
essentially the same limitations. If drug dealers, fugitives, and illegal immigrants 
were afraid to fill out the Census form that everyone in the nation was supposed to 
receive, they are not likely to step forward a few weeks or months later when their
household is singled out for a visit by another government enumerator. On the 
contrary, they are likely to avoid the coverage survey even more studiously than 
they avoided the Census. Thus, we cannot believe that a coverage survey provides 
a good measure of undercount unless we are first willing to believe that 
somehow — without the tools necessary to do so — it manages to secure participation 
by these two groups of people who were not counted well by the Census. 

9 Howard Hogan, op. cit.




If a coverage survey misses many of the same people who were missed by the 
Census, then the only way it can suggest a plausible level of undercount is by 
identifying other people as missed by the Census when they really were counted. 
This leads us to the second impossible task which a coverage survey must 
accomplish: achieving a practically-perfect replication and matching of Census 
results for that vast majority of the population which is counted correctly the first 
time. The problem is that, for every hundred people missed by a Census, there 
are about 3,000 people who were counted and can therefore be mistakenly 
identified as missed. These 3,000 people will inevitably include a certain number 
of challenging cases involving aliases, language barriers, individuals and 
households that have moved, people with no stable place of residence, and a host of 
other difficulties. It doesn't take a large error rate in classifying these 3,000 
people who were correctly counted by the Census to completely invalidate our 
attempt to count the 100 people who were missed — especially since many of the 
people who were missed are making every effort to be missed again. A 
hypothetical example will help to demonstrate why even a 99% level of accuracy is 
not sufficient, and a review of the barriers faced by a coverage survey will 
demonstrate why 99% accuracy is not likely to be achieved. 

Let’s say that the next Census has an undercount of 3% and an overcount of 1%, 
for a net undercount of 2%. Let us also assume that the next coverage survey 
somehow manages to identify all of the people who are missed by the Census and 
all of the people who are counted twice or counted in error. This is a very 
generous assumption, since we have already seen that we have good reason to 
believe that this is an impossible task. Finally, let us assume that the coverage 
survey achieves 99% accuracy in classifying the individuals who were counted by 
the Census. 

The apparent undercount will then include that 3% of the population which had 
been missed by the Census, plus nearly another 1% that had actually been counted
correctly. This is because 1% of the 97% not missed by the Census will be 
falsely identified as undercounted because we achieve “only” 99% accuracy in 
replicating and matching the Census results. Thus, even under these unrealistically 
favorable assumptions, about 25% of the apparent undercount will actually 
represent classification error. 10 The measure of overcount will be even more 
problematic: It will include that 1% of the population that had actually been 

10 Expressed as a proportion of the actual population, the people counted by the Census who are mis-classified as
uncounted in this hypothetical example will be (1.00 - .03) * (1.00 - .99) = .0097, where .03 is the assumed rate
of undercount and .99 is the assumed level of accuracy. If we assume that all of the actual undercount will be 
detected through the coverage survey, the total estimate of undercount will be .03 + .0097 = .0397. Expressed as 
a proportion of the identified undercount, the people who are mis-classified as uncounted will therefore be .0097 / 
.0397 = .2443, or approximately 25%. 




overcounted, plus nearly another 1% that had been counted correctly the first 
time. This means that about 50% of the apparent overcount will actually represent 
classification error. 11 This would hardly be a firm basis for fine-tuning the 
Census count. 
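
The arithmetic of this hypothetical example can be retraced with a short calculation. The sketch below is only an illustration of footnotes 10 and 11: the function name and its parameters are assumptions introduced here, and the only inputs are the rates assumed in the text (3% missed, 1% counted in error, 99% accuracy, and complete detection of the true undercount and overcount).

```python
def classification_error_share(true_rate, accuracy, missed_rate=0.03, detected=1.0):
    """Return (false_hits, apparent_rate, error_share) as fractions of the population.

    true_rate   -- actual undercount or overcount being measured
    accuracy    -- share of correctly counted people who are classified correctly
    missed_rate -- share of the population missed by the Census
    detected    -- share of the true rate that the coverage survey finds
    """
    # People the Census counted who are wrongly flagged by the survey.
    false_hits = (1.0 - missed_rate) * (1.0 - accuracy)
    # What the survey reports: the real cases it finds plus the false ones.
    apparent_rate = true_rate * detected + false_hits
    return false_hits, apparent_rate, false_hits / apparent_rate


undercount = classification_error_share(true_rate=0.03, accuracy=0.99)
overcount = classification_error_share(true_rate=0.01, accuracy=0.99)

for label, (false_hits, apparent, share) in (("undercount", undercount),
                                             ("overcount", overcount)):
    print(f"{label}: false hits {false_hits:.4f}, "
          f"apparent rate {apparent:.4f}, error share {share:.1%}")
```

Under these assumptions the calculation gives error shares of roughly one quarter for the apparent undercount and roughly one half for the apparent overcount, matching the figures in the footnotes.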


Why the Word “American” Is Abbreviated in Census Questions 

When you are trying to measure a small component of the population, such as people who
have been missed by the Census, it is necessary to avoid even very small errors in
classifying that vast majority of the population which is not part of the group being 
measured. 

This principle is illustrated by one of the problems that the Census Bureau found while it 
was testing different ways of asking its new Hispanic-origin question for the 1980 Census. 
A very small number of people with no Mexican heritage thought that the category “Mexican 
or Mexican-American” meant “Mexican or American.” Since they were “American,” they
thought that this category applied to them. Unfortunately, since people of Mexican heritage 
represented only about 4% of the national population, even this very small error among the 
remaining 96% of the population was enough to completely invalidate the count of Mexican- 
Americans. In fact, for many areas, a majority of the people selecting this category were 
found to be “Americans” with no Mexican heritage. 

The 1980 Census therefore used the category “Mexican or Mexican-Amer.” This was a big 
improvement, but the 1980 post-enumeration survey found that non-Mexicans still 
represented a majority of the people choosing this category in some areas with a very low 
population of Mexican-Americans. The 1990 Census therefore used the category “Mexican
or Mexican-Am.” This cleared up the problem.

A very similar difficulty arises when you try to measure undercount with a coverage survey. 
It is sometimes very hard to match up the people that you counted in the coverage survey 
with the people that you counted in the Census. When you make a mistake, people can be 
counted as missed by the Census or as mistakenly included in the Census when they really 
weren't. Since there are about 97 of these potential mistakes for every 3 people who were 
really missed by the Census, even a very low error rate is enough to completely invalidate 
the measure of undercount. Unfortunately, although the problem is very similar, the 
solution is not: Errors in matching surveys cannot be prevented by anything as simple as 
using more abbreviations. 
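
The principle in this box can be expressed as a break-even error rate: the rate of mistakes among people outside the group at which false cases equal true cases. The sketch below is a hypothetical illustration, not a calculation from the Census Bureau. It assumes every true member of the group is identified, the function names are introduced here, and the 0.5% local prevalence is an invented value standing in for an area with few members of the group.

```python
def false_share(prevalence, error_rate):
    """Share of the measured group that does not really belong to it."""
    false_hits = (1.0 - prevalence) * error_rate
    return false_hits / (prevalence + false_hits)


def break_even_error_rate(prevalence):
    """Error rate at which false cases equal true cases (a 50% false share)."""
    return prevalence / (1.0 - prevalence)


# 4%: people of Mexican heritage nationally; 3%: people missed by the Census;
# 0.5%: hypothetical area where the group is a very small part of the population.
for prevalence in (0.04, 0.03, 0.005):
    rate = break_even_error_rate(prevalence)
    print(f"prevalence {prevalence:.1%}: false cases become a majority "
          f"once the error rate exceeds {rate:.2%}")
```

The smaller the group being measured, the smaller the error rate that is enough to make false cases a majority, which is why the problem was worst in areas with a very low population of Mexican-Americans.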


11 Expressed as a proportion of the actual population, the people counted by the Census who are mis-classified as
counted in error will be (1.00 - .03) * (1.00 - .99) = .0097, where .03 is the assumed rate of undercount and .99 is 
the assumed level of accuracy. If we assume that all of the actual overcount will be detected through the coverage 
survey, the total estimate of overcount will be .01 + .0097 = .0197. Expressed as a proportion of the estimated 
overcount, the people who are mis-classified as counted in error will therefore be .0097 / .0197 = .4924, or 
approximately 50%. 




A coverage survey must therefore achieve far more than 99% accuracy in 
classifying the people who are correctly counted by the Census. But is it possible 
to achieve such a high level of accuracy? Even for simple surveys conducted 
under ideal conditions, a 99% level of accuracy would be impressive. 
Unfortunately, the Census and the coverage survey are not simple, and they are 
not conducted under ideal conditions. The attempt to match the results of these 
two surveys must contend with a wide array of daunting problems, some of which 
are listed in the box on the following page. These problems are more than just 
hypothetical illustrations: many of them have been documented and quantified by 
analysts from the Census Bureau and elsewhere, who confirm that the undercount 
analysis involves very serious levels of matching error and other error. (See 
accompanying paper, “Quantifying Measurement Error and Bias in the 1990 
Undercount Estimates.”) Thus, in addition to knowing from logical arguments 
and hypothetical illustrations that serious problems are inevitable, we know from 
experience that serious problems actually do occur. 

In place of our previous assumptions that a coverage survey measures overcount 
and undercount perfectly and that it matches the correct findings of the Census 
with 99% accuracy, we should therefore consider the implications of a somewhat 
more modest level of success. Let’s say that the next coverage survey identifies 
30% of the actual undercount and 40% of the actual overcount, that the 
undercount analysis averages an impressive 96.2% rate of accuracy in replicating 
and matching the correct results of the Census, and that the overcount analysis 
averages a similarly impressive 97.3% rate of accuracy. Although classification 
error would then account for an overwhelming 80% of the people identified as 
undercounted and 87% of the people identified as overcounted, the estimated net 
undercount at the national level would be the same 1.6% that was suggested by the 
coverage survey for 1990. 12 In other words, the estimate of undercount 
would primarily reflect errors in matching survey responses with Census 
responses, yet the broadest national estimate of net undercount would appear very 
plausible. 
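
The figures in this second hypothetical example can be checked in the same way. The sketch below simply retraces the arithmetic of footnote 12 using the rates assumed in the text. The constant and function names are assumptions introduced for illustration.

```python
MISSED = 0.03       # assumed share of the population missed by the Census
OVERCOUNTED = 0.01  # assumed share counted twice or counted in error


def apparent_rate(true_rate, detection, accuracy):
    """Return (apparent rate, false hits) for a coverage survey.

    detection -- share of the true undercount or overcount actually found
    accuracy  -- share of correctly counted people who are classified correctly
    """
    false_hits = (1.0 - MISSED) * (1.0 - accuracy)
    return true_rate * detection + false_hits, false_hits


under_total, under_false = apparent_rate(MISSED, detection=0.30, accuracy=0.962)
over_total, over_false = apparent_rate(OVERCOUNTED, detection=0.40, accuracy=0.973)
net_undercount = under_total - over_total

print(f"apparent undercount {under_total:.5f}, "
      f"of which classification error {under_false / under_total:.1%}")
print(f"apparent overcount  {over_total:.5f}, "
      f"of which classification error {over_false / over_total:.1%}")
print(f"estimated net undercount {net_undercount:.2%}")
```

Most of each apparent rate is classification error, yet the net figure lands close to the national estimate cited for 1990, which is the point of the example.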


12 Expressed as a proportion of the actual population, the people counted by the Census who are mis-classified
as uncounted in this hypothetical example will be (1.00 - .03) * (1.00 - .962) = .03686, where .03 is the assumed
rate of undercount and .962 is the assumed level of accuracy. If we assume that 30% of the actual undercount will
be detected through the coverage survey, the total estimate of undercount will be (.03 * .30) + .03686 = .04586.
Expressed as a proportion of the identified undercount, the people who are mis-classified as uncounted will
therefore be .03686 / .04586 = .8038, or approximately 80%.

The people counted by the Census who are mis-classified as counted in error will be (1.00 - .03) * (1.00 - .973) =
.02619, and the total estimate of overcount will be (.01 * .40) + .02619 = .03019. Expressed as a proportion of
the identified overcount, the people who are mis-classified as counted in error will therefore be .02619 / .03019 =
.8675, or approximately 87%. The estimate of net undercount will be .04586 - .03019 = .01567 or 1.6%.




AN IMPOSSIBLE TASK 

The Census Bureau tries to measure undercount by carefully taking a second survey for a sample of small 
geographic areas and comparing its results to the Census to see which persons had been missed. But is it 
possible to achieve a near-perfect match between these two surveys? This effort has to deal with daunting 
problems such as these: 

• Illegible handwriting. 

• Similarity of names. 

• Use of different nicknames and other variations on names in different surveys. 

• Names which do not have a consistent spelling in the English alphabet. 

• Use of aliases by illegal immigrants, fugitives, and others who place a very high value on privacy. 
Some people have more than one alias, some may use different names on different surveys, and some 
may be known to neighbors by names that are different from the ones used on the Census. 

• Irregular living arrangements, complex households, and households with unstable membership. 

• Differences which arise from collecting most Census information through written forms and collecting 
information for the coverage survey through personal interviews. 

• Households and individuals that move between the Census and the coverage survey. (This is 
particularly a problem for college students, recent graduates from high school or college, and people 
who migrate between northern and southern states on a seasonal basis. Many of these people move 
within a few weeks after the April Census.) 

• Differences which arise from having different household members provide information for the 
different surveys, or from having a responsible household member provide information for the Census 
and a child, neighbor, or landlord provide information for the coverage survey. (For example, 
differences in the reported name, age, race, or marital status can make it difficult to determine whether a 
person found by the coverage survey is really the same person found by the Census.) This problem 
was compounded in 1990 because the survey to measure undercount was centered around the Fourth 
of July weekend and the survey to measure “erroneous enumerations” was centered around the 
Thanksgiving weekend. It is very difficult, for example, to survey a college town during Thanksgiving 
week to determine who was living there the previous April. 

• Language barriers. Language barriers are a particularly serious problem for a coverage survey because 
it relies upon personal interviews instead of on a written survey that respondents can complete with help 
from friends or other family members. 

• People who are included on the Census but avoid inclusion on the coverage survey because they do not 
want to be identified by government authorities. 

• Homeless or transient people who are enumerated in one housing unit by the Census but are in a 
different housing unit or on the streets at the time of the coverage survey. 

• Homeless or transient people who are enumerated in the streets by the Census but are found in a 
housing unit by the coverage survey. 

• Information that is fabricated by the enumerator or by the respondent. 

• Clerical errors and processing errors. 

• Failure to follow complex procedures precisely. 

• Census forms which are coded to the wrong geographic area, making it impossible to match them
with the proper survey results. 

• People who give an inaccurate response when they are asked where the members of their household 
were living on April Fools Day. 






To people who are interested only in the national count of total population, the 
hypothetical example above may not appear very troubling. After all, since this 
example assumes that the errors in measuring undercount are largely offset by the 
errors in measuring overcount, the national population total it produces is actually 
closer to the assumed true population than the unadjusted Census count. What 
makes this example troubling is the fact that the undercount adjustments are relied 
upon for far more than a national population total. They purport to tell us which 
segments of the population and which parts of the country are undercounted more 
than others. The critical point that needs to be understood is that, if the coverage 
survey really does fail to measure a large portion of the undercount and if it 
mistakenly identifies people as missed by the Census who really weren't, then the 
differential undercounts it suggests will largely reflect differences in the amount 
of error in measuring undercount rather than differences in the amount of 
undercount itself. What would we expect such adjustments to look like? To put it 
simply, we would expect them to look just like adjustments developed from the 
1990 Post-Enumeration Survey. 

Figure 2 


Alternate Estimates of Undercount 
for the 1990 Census 13 



13 The undercount estimates based on the PES are from Barbara Everitt Bryant, “Census-Taking for a Litigious, Data
Driven Society,” Chance: New Directions for Statistics and Computing , Vol. 6, No. 3, 1993. The estimates 
based on demographic analysis are from U.S. Department of Commerce, "Census Bureau Releases Refined 1990 
Census Coverage Estimates from Demographic Analysis," Press Release of June 13, 1991, Table 1. 




At the national level, it would not be surprising for the undercount adjustments to 
look fairly reasonable: Since the population groups that are hard to match 
between two surveys are generally the same groups that are hard to count in the 
Census, we would expect the findings for very broad components of the population 
to be at least roughly similar to the results of demographic analysis. Of course 
they wouldn’t be identical, since the level of difficulty in matching each group 
between surveys does not correspond precisely to the level of difficulty in 
counting it for the Census. For example, some problems such as language barriers 
and aliases pose more difficulty in survey-matching than in taking a Census, and 
segments of the population that are counted very well in the Census are at the 
greatest risk of having classification error exceed the actual level of undercount. 
Thus, while advocates of adjustment have not considered the pattern of differences 
displayed in Figure 2 to be unreasonable, the final national PES results for 1990 
are actually quite different from the estimates based on demographic analysis even 
for very broad population groups. The apparent undercount for black males is 
42% less than the rate suggested by demographic analysis, and the rate for white, 
Native American, and Asian/Pacific females is 50% higher. Under most 
circumstances, these differences would be considered very substantial. 

We would expect an even worse situation below the national level. If the measure 
of net undercount is more sensitive to variations in the rate of classification error 
and other survey problems than to variations in the actual rate of undercount, it 
would not be surprising to find some serious deviations from the orderly pattern 
that would be found in a practically-perfect analysis. For example, it would not be 
surprising for the adjustment factors to look something like the ones displayed in 
Figure 3. 

Figure 3 shows some of the initial undercount adjustments for children under age 
10 which the Census Bureau developed based on the 1990 PES. This age group 
was chosen for this analysis because there is no obvious reason to expect 
householders to mis-report their young male children at a significantly different 
rate from their young female children. It is therefore disconcerting that these 
undercount adjustments for 1990 include some very large differences between 
boys and girls in this age group. In fact, these eighteen pairs of figures were 
selected for the table because they each have a discrepancy of over ten percentage 
points. It is even more disconcerting that these differences follow no discernible 
pattern. Sometimes the adjustment for boys is higher, but sometimes the 
adjustment for girls is higher; in one place black renters have a higher adjustment 
for boys, but in another place they have a higher adjustment for girls; in some 
places the gender discrepancy for whites is similar to the gender discrepancy for 




blacks, but in other places it is the opposite; sometimes one race category in a 
large city has a higher adjustment for boys, but another race in the same city has a 
higher adjustment for girls. It is not surprising when signs of estimation error are 
visible for small components of the population in small geographic areas, but here 
we see apparently arbitrary adjustments for even the largest population groups in 
some of the largest cities and across entire regions. Thus, the adjustment factors in 


Figure 3 

Selected Undercount Adjustments for Children Under Age 10 
from the 1990 Post-Enumeration Survey 14 


Line | Region | [heading illegible] | Tenure | Race | Adjustments [two columns; sub-headings illegible]

1 | Pacific | Non-Central Cities | Renter/Owner | Asian/Pacific | [illegible] | +17%
2 | Mid Atlantic | Central Cities in New York City PMSA | [illegible] | [illegible] | +25% | +9%
3 | East North Central | Central Cities in Metro Areas w/ Central City > 250K | [illegible] | Black | +26% | +15%
4 | Pacific | Central Cities in Los Angeles PMSA | Owner | Black | +28% | +8%
5 | Mid Atlantic | Central Cities in New York City PMSA | Owner | Black | +0% | +23%
6 | South Atlantic | Central Cities in Metro Areas w/ Central City > 250K | Renter | Black | +26% | +16%
7 | Pacific | Central Cities in Los Angeles PMSA | Renter | Black | +20% | +10%
8 | Pacific | Non-Central Cities | Renter/Owner | Black | +31% | +6%
9 | Mid Atlantic | Non-Central Cities in Metro Areas w/ Central City > 250K | [illegible] | [illegible] | +2% | +16%
10 | Mid Atlantic | All Central Cities | [illegible] | [illegible] | [illegible] | [illegible]
11 | West South Central | Central Cities in Houston, Dallas, & Fort Worth PMSA's | [illegible] | [illegible] | +8% | +19%
12 | South Atlantic | All Non-Metro Areas & All Non-Central Cities | [illegible] | [illegible] | +9% | +22%
13 | West South Central | Central Cities in Metro Areas w/ Central City > 250K | [illegible] | [illegible] | [illegible] | +11%
14 | East North Central | Central Cities in Metro Areas w/ Central City > 250K | [illegible] | [illegible] | +21% | +4%
15 | East North Central | Central Cities in Detroit and Chicago PMSA's | Renter | White, Native Am., & Asian/Pacific except Hisp. | -4% | +14%
16 | West South Central | Central Cities in Houston, Dallas, & Fort Worth PMSA's | Renter | White, Native Am., & Asian/Pacific except Hisp. | +7% | +21%
17 | South Atlantic | Central Cities in Metro Areas w/o Central City > 250K | Renter/Owner | [illegible] | +10% | -1%
18 | South Atlantic | Non-Metro Areas Except Places > 10K | Renter/Owner | White, Native Am., & Asian/Pacific except Hisp. | [illegible] | +16%


14 U.S. Department of Commerce, Bureau of the Census. Unpublished file dated 6/14/91 containing adjustment
factors derived from the 1990 Post-Enumeration Survey, prior to application of a statistical smoothing procedure. 
These adjustment factors reflect the amount of apparent net undercount actually measured in the PES sample for 
the indicated geographic areas and demographic groups. 




Figure 3 suggest a high level of measurement error 15 rather than the high level of 
precision required for an adequate estimate of undercount. 

Would the Adjustments Increase or Decrease Accuracy? The PES 
findings in Figure 3 provide a good basis for testing whether we can trust a 
coverage survey when it tells us that some population groups have higher 
undercounts than others. We have seen that these apparent undercounts seem to be 
implausible, but that by itself does not prove that they did not happen. If we can 
confirm that these differential undercounts did take place, then the credibility of 
coverage surveys as a tool for measuring undercount will be greatly increased. 
On the other hand, if it can be demonstrated that they did not take place, then the 
credibility of coverage surveys will be lost: If a coverage survey can indicate 
large undercount differentials where they do not exist, then it is obviously not a 
very reliable tool for measuring undercount. 

Fortunately, because the ratio of male to female children is one of the most stable 
of all demographic statistics, these adjustment factors can be tested quite 
definitively. For each of the nation's nine regions, 51% of the young children 
enumerated in the 1990 Census were boys and 49% were girls. Likewise, for each 
of the major race categories, 51% of the young children enumerated were boys 
and 49% were girls. Among the nation's 284 metropolitan areas and consolidated 
metropolitan areas, the percent of young children who were boys varied very 
little, ranging from a low of 50.3% in Pine Bluff, Arkansas, to a high of 52.1% in 
Topeka, Kansas. Therefore, if the large differential undercounts indicated in 
Figure 3 really did take place, they should be very obvious: Boys should represent 
less than 51% of the total for areas with a large undercount of boys, but they 
should represent more than 51% of the total for areas with a large undercount of 
girls. Furthermore, if the undercounts indicated by the coverage survey really did 
take place, we should expect each area to move closer to the norm after it is 
“corrected” for Census undercount. 

In fact, however, we find just the opposite. Figure 4 shows that the percentage of 
children under age 10 who are boys is about the same not only in each region, 
each race, and each metropolitan area, but also in the areas for which the 

15 There are several types of measurement error. Although the point being made here is that the large amount of
error in the adjustments is consistent with the thesis that large amounts of non-sampling error are inevitable, it
should be noted that sampling error is also a very serious problem for the undercount adjustments. Actually,
there is more than enough error to go around: these adjustments can reflect a very large amount of sampling error
as well as a very large amount of non-sampling error. For purposes of data quality, both types of error are very 
problematic. 




Figure 4 


Before Adjustment for Undercount 

Percent of Children Who Are Boys Offers No Surprises 16 



Figure 5

Dramatic Variations in Percent of Children Who Are Boys 17



16 The percent of children who are boys was calculated based on the 1990 Census of Population and Housing, U.S.
Department of Commerce, Bureau of the Census, Summary Tape File 1-C. Because Census counts by age, race, 
sex, and tenure have not been published, this table does not include the nine pairs of adjustments in Figure 3 
which apply only to renters or only to homeowners. Although the race distinctions which are made in Summary 
Tape File 1-C do not correspond precisely to the race distinctions upon which the undercount adjustments were 
calculated, these discrepancies involve a very small number of people and they do not significantly affect the 
present analysis. Black Hispanics are counted as Hispanic in STF 1-C, but they should not be included with 
other Hispanics for purposes of applying undercount adjustments. Likewise, Asians/Pacific Islanders of Hispanic 
origin are counted as Asians/Pacific Islanders in STF 1-C, but they should not be included with that group for 
purposes of applying undercount adjustments. 

17 The data in Figure 5 were calculated after applying the adjustment factors from Figure 3 to Census counts from
Summary Tape File 1-C.




coverage survey found large undercount differentials between boys and girls. It is 
only after applying these adjustments derived from the coverage survey that 
serious anomalies are found. As shown by Figure 5, the percentage of children 
who are boys deviates dramatically from the norm after adjustment. Even though 
Pine Bluff and Topeka are “outliers” among the nation’s metropolitan areas, the 
adjusted Census counts are two to six times as far from the norm as Pine Bluff 
and Topeka. Thus, these “undercounts” measured in the PES sample do not 
correspond at all to actual undercounts in the areas which the sample represents. 
The Census is not really broken until after it is fixed. 
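
The size of these distortions can be approximated with a simple calculation. The sketch below is a simplification rather than a reproduction of Figure 5: it assumes that a group starts at the national norm of 51% boys and that an adjustment of +31% multiplies a count by 1.31. It uses the pair of adjustments (+31% and +6%) listed in Figure 3 for black children in non-central cities of the Pacific region. Which figure of the pair applies to boys is treated here as unknown, so both assignments are shown. The function and variable names are assumptions introduced for illustration.

```python
NORM = 51.0          # percent of young children who are boys, per the 1990 Census
TOPEKA = 52.1        # highest metropolitan-area value cited in the text
PINE_BLUFF = 50.3    # lowest metropolitan-area value cited in the text


def adjusted_percent_boys(adj_boys, adj_girls, base_percent_boys=NORM):
    """Percent of children who are boys after applying separate adjustments."""
    boys = base_percent_boys * (1.0 + adj_boys)
    girls = (100.0 - base_percent_boys) * (1.0 + adj_girls)
    return 100.0 * boys / (boys + girls)


pair = (0.31, 0.06)  # one pair of adjustments from Figure 3
for adj_boys, adj_girls in (pair, pair[::-1]):
    percent = adjusted_percent_boys(adj_boys, adj_girls)
    print(f"boys {adj_boys:+.0%}, girls {adj_girls:+.0%}: "
          f"{percent:.1f}% boys, {abs(percent - NORM):.1f} points from the norm")

print(f"Topeka is {TOPEKA - NORM:.1f} points from the norm, "
      f"Pine Bluff {NORM - PINE_BLUFF:.1f}")
```

Whichever way the pair is assigned, the adjusted share of boys moves several times farther from the norm than the most extreme metropolitan area in the unadjusted Census.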

The point being made here is not merely that the 1990 coverage survey produced 
faulty undercount measurements for young boys and girls. The problem is much 
broader than that, since the difficulties discussed in this paper apply just as much 
to other age groups as to children, and just as much to other demographic 
characteristics as to the sex ratio. The foregoing analysis focuses on the sex ratio 
of children merely because sex ratios provide a convenient and definitive basis for 
demonstrating the implausibility of the undercount measurements below the age 
where school attendance, military service, and employment patterns cause 
different communities to have a different mix of males and females. The focus on 
the sex ratio of young children should not by any means imply that undercount 
measurements are worse for this age group or that they would affect sex ratios 
more than the other population and housing characteristics that are measured by 
the Census. In the absence of any known problem that would scramble the 
undercount measurements for boys and girls without affecting the figures for 
other age groups and other demographic characteristics, we have to suspect that 
the measurements are faulty in other respects as well. The point being made is 
therefore nothing less than this: Because the large undercount differentials shown 
in Figure 3 are clearly spurious, we cannot trust a coverage survey to tell us which 
segments of the population have higher undercounts than others. 

Does It Make a Difference? It may take a few moments to comprehend the 
impact that adjustment factors like those displayed in Figure 3 would have if they 
were applied to the Census. 18 To those of us who have become accustomed to 

18 The adjustment factors in Figure 3 reflect the amount of apparent net undercount actually measured in the PES
sample for the indicated geographic areas and demographic groups. It should be noted that these factors were
subsequently subjected to a statistical “smoothing” procedure to produce new factors that followed a more
consistent pattern by age, race, and sex. It was these “smoothed” factors that were actually proposed in 1991 for
use in adjusting the 1990 Census. Further modifications proposed in 1992 for use in adjusting the population
base for population estimates would have combined males and females under age 17. The resulting “collapsed”
adjustment factors represent the Census Bureau’s latest official estimate of undercount in the 1990 Census. The
“smoothed” adjustment factors would be appropriate for use in estimating the practical impact of adjusting the
1990 Census data for undercount. The “unsmoothed” adjustment factors are pertinent for the current analysis,




Census data that generally make sense at the local level, it is mind-boggling to 
consider the prospect of largely arbitrary adjustments — and sometimes arbitrarily 
large ones — applied to every number in the Census. In an effort to address a 
relatively small inaccuracy at the national level, we would utterly destroy the 
reliability of Census data at the state and local level. 

Perhaps most alarming is the impact on comparisons over time. If coverage 
surveys can indicate large differential undercounts between boys and girls even 
where no differences exist, they can also indicate large differential undercounts 
between one Census and the next where no differences exist. To illustrate the 
potential implications of this problem, let us consider what would happen if there 
turns out to be no real difference in certain undercount rates for Census 2000 and 
Census 2010, but the coverage surveys indicate the same spurious differences 
between these two points in time that the 1990 PES found between boys and girls. 
Under these assumptions, the numbers in Figure 3 could all remain the same, 19 
but they would represent spurious undercount differentials between Census 2000 
and Census 2010 instead of spurious undercount differentials between boys and 
girls in 1990. This would generate many interesting demographic “findings”: 

• The counts of Asians/Pacific Islanders in non-central cities of the Pacific region 
would be inflated by 5% in 2000 but by 17% in 2010. (See line 1 of Figure 3.) 
The adjusted Censuses would therefore suggest far greater growth in the 
number of Asians than actually occurred. What effect would this have on 
attitudes toward Asian immigrants in these communities? 

• The count of black homeowners in central cities of the Los Angeles PMSA 
would be inflated by 28% in 2000 but by only 8% in 2010. Similarly, the 
count of black renters would be inflated by 20% in 2000 and by 10% in 2010. 
(See lines 4 and 7 of Figure 3.) The adjusted Census data would therefore 
show a large exodus of the black population and a substantial drop in black 
home ownership for Los Angeles relative to the actual trend. What impact 
would this have on race relations? What would be the impact on government 
housing programs and anti-discrimination programs? 


since they reflect the amount of apparent undercount actually identified by the PES. The unsmoothed factors are 
also relevant in the context of Census 2000, since the Census Bureau does not plan to use a statistical smoothing 
process in the next Census. 

19 Our assumption that “the undercount adjustments indicate the same spurious differences between these two
points in time that the 1990 PES found between boys and girls” does not require the adjustments themselves to
be the same as the 1990 adjustments for boys and girls, but merely for the differences to be the same. The
numbers “could” remain the same, but they would not necessarily have to. For simplicity and clarity of
presentation, the illustrations are based on the special case in which the adjustments are the same.




• The count of black homeowners in central cities of the New York City PMSA, 
on the other hand, would be inflated by 0% in 2000 and by 23% in 2010. (See 
line 5 of Figure 3.) This area would therefore seem to have a dramatic rise in 
black home ownership relative to the actual trend. Of course, home ownership 
would not by any means be the only variable affected by these faulty adjustment 
factors: Poverty, marital status, and every other characteristic that is 
correlated with race and with home ownership would also be affected. Social 
scientists could spend the decade trying to explain why the economic status of 
blacks seemed to rise so rapidly in New York City while it seemed to decline in
Los Angeles. What would be the impact on the credibility of the Census when 
they discovered the answer? 

• The counts of White, Native American, and Asian/Pacific renters in Detroit and 
Chicago would be decreased by 5% in 2000, but they would be inflated by 
11% in 2010. Thus, there would seem to be a dramatic increase in renters and
a shift away from home ownership in these cities relative to the actual trend. 
(See line 15 of Figure 3.) In contrast, other central cities in these same 
metropolitan areas would have their counts for these demographic categories 
inflated by 21% in 2000 and by only 4% in 2010. (See line 14 of Figure 3.) 
The faulty adjustment factors would therefore make it appear that huge 
numbers of white renters had moved from Detroit and Chicago to other nearby 
central cities before 2000, but that they moved back in the next decade. 

Of course, these illustrations are only hypothetical. Perhaps Los Angeles will have 
reasonable undercount adjustments for black homeowners in 2000 and 2010. 
Maybe its adjusted Census data will show a spurious decline in its elderly 
population instead, and maybe it will be New York that shows a spurious decline 
in black home ownership. We won’t know before it happens. Even worse, we 
won’t know even after it happens. When adjusted Census data suggest a dramatic 
change in population trends, we will not know how much of the change represents 
actual demographic shifts and how much represents spurious differences in 
undercount adjustments. Are we ready to discover dramatic new (and totally 
false) trends in disease prevalence, mortality rates, school enrollment, income 
distribution, housing patterns, marital status, welfare dependency, gender 
differences, and all of the other issues that are studied on the basis of Census data? 
We expect a Census to increase our knowledge about population trends, but an 
adjustment methodology which can indicate large differentials where differentials 
do not exist would increase our ignorance instead. 




Conclusion. We cannot escape the conclusion that the method proposed for 
correcting Census undercount has some rather serious shortcomings. The impact 
on the validity of the 1990 Census would have been devastating, and we can expect 
the impact on Census 2000 to be similar: The problems are not due to minor 
flaws in methodology or implementation, but rather to the impossibility of 
measuring undercount through the proposed coverage survey. Unless we can 
convince people who don’t want to be counted to answer our surveys, and unless 
we can replicate and match the valid Census results with near-perfect accuracy, 
any undercount estimates that are developed in this manner will be dominated by 
measurement error. Instead of describing variations in the amount of undercount 
from one area to another, they will largely describe variations in the amount of 
error in replicating the Census and in matching individuals identified by the 
survey with individuals identified by the Census. Once the impossibility of the 
task is recognized, one can only be impressed by how close the Census Bureau 
seemed to come to succeeding in 1990. However, one must also be impressed by 
how close we are to destroying the credibility and the value of the Census. 




Quantifying Measurement Error and Bias 
in the 1990 Undercount Estimates 


Kenneth Darga 

Office of the State Demographer 

Michigan Information Center 

Michigan Department of Management and Budget 


April 29, 1998 





The opening pages of the preceding paper 1 set up a paradox: Since the number 
of people who want to avoid being identified by the government is more than 
sufficient to account for the level of undercount identified through demographic 
analysis, and since many of these people can be counted upon to avoid the 
coverage survey as well as the Census, how is it that the 1990 coverage survey 
suggests about the right level of total undercount at the national level? 

The solution I have proposed is that this “correlation bias” — i.e. missing many of 
the same people in both the coverage survey and the Census — is offset by 
counting some people as missed by the Census when they really were included. I 
have suggested that, rather than just reflecting undercount, the undercount factors 
derived from the coverage survey reflect a variety of methodological difficulties 
involving imperfect replication of the census, survey matching, unreliable 
interviews, geocoding problems, and the like. 

The preceding paper demonstrates that this is a plausible solution to the paradox 
and that it is consistent with both the plausible undercount estimates at the 
national level and the implausible estimates for individual poststrata. It shows 
that, although an extremely high level of accuracy is required for an adequate 
measure of undercount, the obstacles to an accurate coverage survey are 
immense. It points out many specific types of error that are difficult or 
impossible to avoid, and it shows that the proposed undercount adjustments for 
1990 were suggestive of high levels of error. 

Even these limited accomplishments of the paper are significant: Proponents of 
the proposed undercount adjustment are left with the task of explaining how the 
1990 coverage survey could indicate very large and demonstrably spurious 
differential undercounts for young children. In addition, they must explain how 
we can rely upon the 5%, 10%, and 20% differential undercounts identified 
between other poststrata when the 5%, 10%, and 20% differential undercounts 
identified between young boys and girls are known to be spurious. They must 
make a believable argument that the coverage survey somehow really did count 
critical groups of people who were missed by the 1990 census, i.e. homeless 
people and the illegal immigrants, drug dealers, fugitives, and others who don't 

1 Kenneth J. Darga, “Straining Out Gnats and Swallowing Camels: The Perils of Adjusting for Census 
Undercount,” Office of the State Demographer, Michigan Information Center, Michigan Department of 
Management and Budget, 1998. 




want the government to know where they are. They must demonstrate either that 
they achieved extremely low error rates in the face of seemingly insurmountable 
obstacles, or else that — notwithstanding the demonstrated inaccuracies of the 
undercount measurements for some individual poststrata — they have enough luck 
and skill to ensure that large errors will offset each other very precisely. Merely 
a general tendency for errors to offset one another is not enough: An extremely 
high level of accuracy is required to measure a phenomenon as small and elusive 
as census undercount at the sub-national level. Each of these issues is critical to 
the success of the effort to measure undercount. The credibility of the proposed 
method cannot be restored unless its proponents are successful on all of these 
points. 

A major limitation of the preceding paper is that, although it suggests what sorts 
of errors are difficult or impossible to avoid, it stops short of showing that those 
errors actually occurred or how serious they were. To fill this gap in the 
analysis, this paper relies upon evaluation studies by the Census Bureau and the 
work of other analysts. That work confirms that the errors are very large 
indeed, and that they did not offset each other precisely in the analysis of the 
1990 coverage survey. 

The Census Bureau has extensively evaluated the process and results of the 1990 
coverage survey, which is commonly referred to as the “Post-Enumeration 
Survey” or “PES.” Its findings are written up in 22 unpublished reports, eight of 
which are referenced in this paper. These reports, which are known as the “P- 
project reports,” were issued in July 1991 under the main title “1990 Post- 
Enumeration Survey Evaluation Project.” These reports are referred to in this 
paper by their number within the series, e.g. “P-4” or “P-16.” Most of the 
references to these reports and many of the other quantitative observations which 
appear below are based upon the work of Dr. Leo Breiman, an emeritus 
professor of statistics at the University of California, Berkeley (Breiman, 1994). 

Six major sources of error are quantified below: matching error, fabrication of 
interviews, ambiguity or mis-reporting of usual residence, geocoding errors, 
unreliable interviews, and the number of unresolved cases. It will be seen that 
the level of error and uncertainty contributed by each of these factors is very 
substantial relative to the magnitude of net undercount. Thus, each of these error 
sources by itself is sufficient to demonstrate that the sort of coverage survey used 
by the Census Bureau is not capable of accurately measuring Census undercount. 
It will then be shown that the various identified sources of error actually did 
increase the 1990 undercount estimate enough to explain the paradox. 




1 . Matching Error 

A critical step in measuring undercount through a coverage survey is to match 
people counted in the coverage survey with people counted in the Census. Most 
people are counted by both surveys, but problems such as misspellings, 
misreporting of age, language barriers, aliases, missing data, errors in recording 
the address, changes in household composition, and a host of other difficulties can 
make it difficult to match up the records. Any failure to match the records can 
lead to an overestimate of undercount: The person's record in the Post- 
Enumeration Survey — sometimes referred to as the “P-Sample” — can be 
mistakenly counted as having been missed by the Census. Yet their Census 
response — the Census enumerations from the same geographic areas are 
sometimes referred to as the “E-sample” — cannot be classified as erroneous 
unless strict criteria are met. 2 (After all, it is a valid record.) Thus, when 
records fail to match, it is possible for people to be counted twice. The many 
barriers to matching the coverage survey results with the Census are described in 
a sidebar of the preceding paper, and their seriousness is confirmed by the results 
of the Census Bureau's evaluation studies. 
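
The way an unmatched record turns into apparent undercount can be illustrated with a simplified capture-recapture calculation. The sketch below assumes the textbook two-sample (Lincoln-Petersen) estimator rather than the Census Bureau's actual 1990 procedure, and the counts and function names are hypothetical. It considers a block in which every resident is counted by both the Census and the coverage survey, so that the true undercount is zero.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Textbook two-sample (Lincoln-Petersen) population estimate."""
    return census_count * survey_count / matched


def apparent_undercount_rate(census_count, survey_count, matched):
    """Share of the estimated population that appears to be missing from the census."""
    estimate = dual_system_estimate(census_count, survey_count, matched)
    return (estimate - census_count) / estimate


# Hypothetical block: 1,000 residents, all counted by both the Census and the survey,
# so the true undercount is zero and all 1,000 survey records have a true match.
CENSUS_COUNT = 1000
SURVEY_COUNT = 1000

for missed_match_rate in (0.00, 0.01, 0.02):
    # Matches actually found when some true matches are missed by the matching process.
    matched = SURVEY_COUNT * (1 - missed_match_rate)
    rate = apparent_undercount_rate(CENSUS_COUNT, SURVEY_COUNT, matched)
    print(f"missed matches {missed_match_rate:.0%} -> apparent undercount {rate:.1%}")
```

Under these assumptions the apparent undercount rate equals the share of true matches that the matching process failed to find, which is why a matching error of only 1% or 2% is enough to produce an apparent undercount of the same size.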

As explained in the P-8 report, a computer-matching process was able to resolve 
about 75% of the P-sample records, and the remaining records went to two 
independent teams of trained matchers. Although these teams used the same 
definitions and guidelines, they had a surprisingly high rate of disagreement 
regarding which people counted by the PES had been counted by the Census. Of 
people classified as “matched” by the first team, 5.7% were classified as “not 
matched” and 4.5% were classified as “unresolved” by the second team. Of those 
classified as “not matched” by the first team, 4.8% were classified as “matched” 
and 1.3% were classified as “unresolved” by the second team. Of those classified
as “unresolved” by the first team, 22.7% were classified as “matched” and 8.0%
were classified as “unmatched” by the second team (Ringwelski, 1991).
Although the matching process must achieve near-perfection in order to 
accurately measure the 1% or 2% of the population that is missed by the Census, 
it is obviously a very difficult task, and even teams using the same guidelines can 
differ widely in their judgments. 
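
The disagreement figures reported above can be tallied for each first-team classification. The sketch below simply sums the percentages quoted from the P-8 report; the dictionary layout and the variable names are illustrative assumptions and are not part of the Census Bureau's evaluation.

```python
# Percent of cases in each first-team category that the second team classified
# differently (figures as quoted in the text from the P-8 report).
reclassified_by_second_team = {
    "matched": {"not matched": 5.7, "unresolved": 4.5},
    "not matched": {"matched": 4.8, "unresolved": 1.3},
    "unresolved": {"matched": 22.7, "not matched": 8.0},
}

# Total disagreement for each first-team classification, rounded to one decimal place.
disagreement_pct = {
    first_team_status: round(sum(changes.values()), 1)
    for first_team_status, changes in reclassified_by_second_team.items()
}

for status, pct in disagreement_pct.items():
    print(f"first team '{status}': second team disagreed on {pct}% of cases")
```

On these figures the second team disagreed with roughly 10% of the first team's matches, 6% of its non-matches, and 31% of its unresolved cases.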


2 For example, Howard Hogan, then director of the Undercount Research Staff of the Census Bureau, wrote: 
“Proving that someone does not exist is not easy. . . . The rules require the interviewer to find at least three 
knowledgeable respondents in an effort to determine whether an enumeration was fictitious.” (Hogan, 1991a). 
This would be difficult to do in a case where an unmatched person really existed. 




This high level of disagreement has several serious implications: 

• First, it indicates that the number of “difficult” cases for which match status is 
not obvious is very large, greatly exceeding the estimated level of net 
undercount. This demonstrates the impossibility of measuring undercount 
accurately through a coverage survey even apart from any other 
considerations. 

• Second, since trained teams differ substantially in their judgments, it follows 
that some of the judgments reached by the final team of matchers are likely to 
be wrong: Some of the people counted by the Census will be identified as 
missed, some of the people missed by the Census will be identified as counted, 
some of the people counted correctly by the census will be identified as 
counted in error, and some of the people counted in error will be identified as 
counted correctly. If the number of difficult cases were small, we could hope 
that the errors would come close to cancelling each other out. However, given 
the high level of disagreement between the matching teams, any of these types 
of error could potentially exceed the actual level of undercount: “close” is 
therefore not enough. 

• Third, since high levels of subjectivity and art are obviously involved in the 
matching process, it is subject to additional sources of bias. Will the match 
rate be different if the cases are examined in the first week of matching or in 
the final week? Will the match rate be different depending on which regional 
office examines them? If a difficult case falls into a category that is expected 
to have a high undercount rate, will that decrease its likelihood of being 
classified as matched? If a similar case falls into a category that is expected to 
have a low undercount rate, will that increase its likelihood of being classified 
as matched? Such issues can have a significant impact on the differential 
undercount rates of individual poststrata and of different geographic regions. 
If matching were an objective process whose results could be fully determined
by the Census Bureau’s matching rules, these questions would be insignificant. 
However, because the process is obviously a somewhat subjective one, these 
questions become very important. In fact, since the number of difficult cases 
is quite large and the level of disagreement between teams exceeds the total 
level of undercount, these questions must be considered critical. 

• A fourth implication of the high level of disagreement between different 
match teams is that the results for a given set of records are likely to be 
different each time the match is performed. Clear evidence of this is provided 




by the results of rematching selected blocks which initially had large numbers 
of non-matches and erroneously enumerated persons: Rematching only 104 
out of the 5,290 block clusters resulted in a decrease of 250,000 (about 5%) in 
the estimated net national undercount (Hogan, 1993).

2. Fabrication of Interviews 

The problem of fabricated data is another example of a data collection problem 
whose magnitude is very substantial relative to the magnitude of Census 
undercount. Many large surveys conducted by the Census Bureau appear to have 
a significant number of records that are fabricated by the interviewer. Previous 
research has shown that, overall, between 2% and 5% of the interviewers are 
dishonest in their data collection and that between 0.5% and 1.5% of the 
interviews themselves are fabricated (Stokes and Jones, 1989). One-time surveys 
such as the Census and the PES are particularly vulnerable to this problem, since 
temporary employees are found to be more likely to fabricate records than 
permanent employees. Workers who are detected fabricating data sometimes do 
so on a large scale. Biemer and Stokes (1989) found that, on average, 
inexperienced interviewers who were detected fabricating data did so for 30% of 
the units in their assignment; for more experienced interviewers, the rate was 
19%. 

While the prospect that perhaps 0.5% or 1.5% of the Census and PES interviews 
are fabricated may not sound extremely serious at first, it must be remembered 
that we are trying to measure a net undercount of only about 1% or 2% of the 
population. Thus, instead of saying that 0.5% and 1.5% are small relative to 
100%, it is more pertinent to say that they are very substantial relative to 1% or
2%. (Of course, it should be noted that undercount rates are higher than 1% or 
2% for some demographic groups and some types of area. However, that does 
not greatly affect this comparison, since fabrication rates also tend to be highest 
in the areas that are most difficult to enumerate. See Tremblay, 1991, and West, 
1991c). 

Both fabrication in the Census and fabrication in the PES have very serious 
implications for estimating undercount. When a block cluster with interviews 
that were fabricated by a Census enumerator is included in the PES, it will raise 
the rates of undercount and erroneous enumeration for the poststrata represented 
within it. Since, as already noted, it is difficult to prove that people do not exist, 
the increase in the apparent rate of erroneous enumeration may not be as great as 
the increase in the apparent undercount rate. This would lead to an overestimate 




of net undercount for these poststrata. Fabrication within the PES is even more 
problematic. When people counted by the PES are matched against Census 
questionnaires, any fabricated PES records can look like people who were missed 
by the Census. However, when the corresponding Census records are tested for 
validity, they are likely to be classified as valid: It is particularly difficult to 
prove that someone does not exist if they really do exist. Thus, fabrication once 
again can lead to an overestimate of net undercount. Fabricated PES records 
would be particularly difficult to detect in cases where the housing unit was 
vacant during the Census or during PES follow-up. 

The actual amount of fabrication in the PES is difficult to determine. The P-5a 
report, which is based on data which were not specifically designed to detect 
fabrication, identified only 0.03% of the cases in the P-sample evaluation follow- 
up data as fabrications (West, 1991b). These cases were estimated in the P-16 
report to have inflated the national undercount estimate by 50,000 persons, or 
about 1% of the total net undercount (Mulry, 1991). The P-5 report, on the 
other hand, used quality control data collected during the PES to identify 0.26% 
of the PES household interviews and 0.06% of the remaining cases on a national 
level as fabrications (Tremblay, 1991). Although this is a much lower rate of 
fabrication than would be expected based on the studies cited above, it is 
nevertheless about eight times the proportion of cases identified as fabrications in 
the P-5a report, suggesting that perhaps fabrications represent about 8% of the 
total net undercount. Yet another Census Bureau report on this issue, the P-6 
report, was designed to gain knowledge about fabrication that may have been 
undetected in the quality control operation. This report found that only 39% of 
the interviewers whose match rates were suggestive of high levels of fabrication 
had been identified in the quality control operation (West, 1991c). This
suggests that the level of fabrication in the PES may have been close to the level 
that has been found in other similar surveys, making it a very significant problem 
indeed. 
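
The scaling argument in the preceding paragraph can be written out as a short calculation. The sketch below uses only figures quoted in the text, together with the approximate net undercount of five million persons cited elsewhere in this paper. It assumes, as the paragraph implicitly does, that the effect on the undercount estimate scales linearly with the fabrication rate, and it ignores the 0.06% rate reported for the remaining cases. The variable names are illustrative.

```python
NET_UNDERCOUNT = 5_000_000    # approximate national net undercount cited in the text

p5a_rate_pct = 0.03           # share of cases identified as fabricated in the P-5a report
p5a_effect_persons = 50_000   # P-16 estimate of the resulting inflation of the undercount

p5_rate_pct = 0.26            # share of household interviews identified as fabricated in P-5

# How many times larger the P-5 fabrication rate is than the P-5a rate.
rate_ratio = p5_rate_pct / p5a_rate_pct

# Assumption: the effect on the undercount estimate scales linearly with the rate.
scaled_effect_persons = p5a_effect_persons * rate_ratio

print(f"P-5 rate is {rate_ratio:.1f} times the P-5a rate")
print(f"P-5a effect: {p5a_effect_persons / NET_UNDERCOUNT:.1%} of net undercount")
print(f"Scaled effect: {scaled_effect_persons / NET_UNDERCOUNT:.1%} of net undercount")
```

The ratio falls between eight and nine, which corresponds to the "about eight times" and "about 8%" figures given above.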

The P-6 report also found that fabrication rates seemed to vary substantially from 
one region to another. Interviewers who appeared to have high levels of 
fabrication accounted for 2% to 5% of the interviews in most regions, but they 
accounted for 7.7% of the interviews in the Atlanta regional office and 8.8% of 
the interviews in the Denver regional office (West, 1991c). Regional variation
in the amount of fabrication is not surprising, since important factors which are 
likely to influence the fabrication rate vary by region. For example, while PES 
interviews to identify undercount were being conducted at the end of June and 




into July of 1990, most of the northeast and midwest had very pleasant weather. 
Much of the south and west, on the other hand, had long periods with 
temperatures near or above 100 degrees. Denver, for example, had eleven 
consecutive days at the end of June and the beginning of July with temperatures 
of 95 degrees or higher, including five days with temperatures in the 100’s. 
Atlanta had seventeen consecutive days with temperatures of 89 degrees or 
higher, followed by several days of rain. Thus, it is not surprising that 
fabrication seems to have been a more serious problem in these areas. Moreover, 
since fabrication also varies substantially by neighborhood, with interviewers 
being more likely to fabricate records in neighborhoods they perceive as 
dangerous than in safer neighborhoods, it also varies by race and by owner/renter 
status. It therefore appears that fabrication can account for a substantial portion 
of the undercount differentials identified between regions, between types of city, 
and between population groups. 

3. Ambiguity or Misreporting of Usual Residence 

The question of where someone lives is often not as straightforward as it may 
seem. The Census uses the concept of “usual” address: If you are staying 
somewhere temporarily and usually live somewhere else, you are instructed to 
report your “usual” address instead of your address on April 1. For many 
people, this instruction is ambiguous and subject to varying interpretation. 
“Snowbirds” who migrate between the north and south can give the address 
where they spend the largest part of the year, the address where they spend the 
largest part of their life, the address where they are registered to vote, the 
address where they feel most at home, or the address where they happen to be on 
April 1. They might give one answer when they fill out their Census form in 
April and a different answer when they are interviewed for the coverage survey 
in July. Other people who move to or from temporary quarters at about the time 
of the Census can also claim a “usual” address different from the place where 
they were located on Census day. For example, college students who are packing 
up to move out of a dormitory room that they will never see again may use their 
“home” address instead of the college address that the Census Bureau would 
prefer. In comparison with an estimated national undercount of only 1% or 2%
of the population, the people with an indistinct “usual” place of residence
represent a very significant share of the population.

Thus, the task of determining the “appropriate” address for each Census 
respondent amounts to replacing the traditional concept of “usual” address, which 
is defined largely by the respondent, with a set of assignment rules developed by 




the designers of the coverage survey. This can involve the reassignment of large 
numbers of people, and it can potentially have a larger impact on regional 
population distribution than Census undercount itself. 

Given the large number of people with an indistinct “usual” place of residence, it
is not surprising that the Census Bureau's Evaluation Follow-Up Study found many
P-sample respondents who were classified as non-movers for purposes of 
calculating the undercount adjustments, but were identified by new information as 
having moved in after census day. Weighted to a national level, they represented 
274,000 persons, 3 or about 5% of the estimated national net undercount. (Of
course, the impact on the individual poststrata that were most affected would have 
been greater.) It should be noted that these figures do not reflect the full 
magnitude of the problem of indistinct “usual” place of residence: they reflect 
only those cases — presumably a small minority — for which the PES was judged 
to have classified movers incorrectly. 
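
The derivation of the 274,000 figure, which is explained in footnote 3, can be checked with a few lines of arithmetic. The sketch below uses the figures given in that footnote and the approximate net undercount of five million persons cited elsewhere in this paper; the variable names are illustrative.

```python
NET_UNDERCOUNT = 5_000_000             # approximate national net undercount cited in the text

census_day_address_error = 811_000     # figure initially reported in the P-4 and P-16 reports
p_sample_reinterview_errors = 537_000  # other errors included in that figure (Breiman, 1994)

# Persons added to the undercount estimate through incorrect Census-day address assignment.
misclassified_movers = census_day_address_error - p_sample_reinterview_errors
share_of_undercount = misclassified_movers / NET_UNDERCOUNT

print(f"misclassified movers: {misclassified_movers:,}")
print(f"share of net undercount: {share_of_undercount:.1%}")
```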

Finally, it should be noted that different cities and different neighborhoods can 
vary greatly in their proportion of people with an indistinct “usual” place of 
residence. If the sample drawn for a particular poststratum happens to include
some block clusters in a college town or in a retirement community, then its 
adjustment factor will be very strongly affected by this problem. The adjustment 
for a class of cities in an entire region can thus be determined largely by whether 
or not the sample includes a few “outlier” blocks. 

4. Geocoding Errors 

Another task which proves to be very difficult is coding addresses to the proper 
Census Block. Coding a record to the wrong Census block is a very serious 
problem for an undertaking that depends upon matching records between two 
surveys. If a Census record that belongs in a sample block has been mistakenly 
coded to a different block, it may not be found. The corresponding PES record 
would therefore be erroneously classified as missed by the Census. On the other 
hand, if an otherwise valid Census record has been mistakenly coded to the 
sample block, it may be counted as an erroneous enumeration when it fails to 

3 The P-4 report (West, 1991a) and P-16 report (Mulry, 1991) indicated that “census day address error” increased
the undercount estimate by 811,000 persons. However, the Census Bureau subsequently indicated that this
figure included other errors found by the P-sample re-interview as well (Breiman, 1994, p.475). The conclusion
that 274,000 persons were found to have been added to the undercount estimate through incorrect assignment of
Census-day address by the PES is based on subtracting these other errors, which represent 537,000 persons
labeled “P-sample re-interview” in Dr. Breiman’s paper, from the 811,000 persons initially identified as “census
day address error” in the Census Bureau reports. (See Breiman, 1994, pp.467, 471, and 475.)




match with a PES record and when residents of the block indicate that no such 
person lives there. To reduce the magnitude of these problems, both PES records 
in the P-sample and Census records in the E-sample were checked against one or 
two rings of surrounding blocks. According to the P-11 report, 4.08% of the P- 
sample was matched to the Census through geocoding to the surrounding blocks, 
but only 2.29% of the E-sample was classified as correctly enumerated as a result 
of matching with PES records in surrounding blocks. If matching to surrounding 
blocks had not been done, this difference would have been equivalent to an 
approximate excess of 4,296,000 in the P-sample population (Parmer, 1991, 
Attachment). 

This difference highlights the sensitivity of the PES analysis to variations in 
methodology and procedure. As pointed out by Dr. Leo Breiman: “The 
implication of this result is that, if the surrounding blocks search had not been 
done, then geocoding errors would have caused a doubling of the . . . national 
estimated undercount to over 4%. On the other hand, using a larger search area 
might well have produced a much lower undercount estimate.” (Breiman, 1994, 
p.468.) Since 38% of the households that were matched outside their proper 
block in the 1986 PES rehearsal were matched more than five blocks away 
(Wolter, 1987), an expanded search area might have had a very significant effect 
on the measure of undercount. 

The sensitivity of the PES analysis to small variations in methodology and 
procedure is also illustrated by another geocoding problem encountered by the 
PES. It was found that two particular block clusters initially increased the 
undercount estimate by nearly one million people due to faulty census geocoding. 
Most of the people in those blocks had been counted by the Census, but many of 
them were identified as uncounted because they had been erroneously coded as 
living in different blocks. It is somewhat disconcerting that only two block 
clusters out of a total of 5,290 included in the PES can erroneously contribute 
nearly one million people to the undercount estimate, especially since the total 
estimated net undercount is only about five million. Of course, in this case the 
problem was obvious enough to be identified: the influence of these block 
clusters was downweighted so that they contributed “only” 150,000 to the 
estimated undercount (Hogan, 1991b). One has to wonder, however, how many
similar problems may have gone undetected and uncorrected. 
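
The leverage of these two block clusters can be made concrete with the figures quoted above. The sketch below treats "nearly one million" as one million for simplicity, so the first share is a slight overstatement; the variable names are illustrative.

```python
TOTAL_BLOCK_CLUSTERS = 5_290
NET_UNDERCOUNT = 5_000_000            # approximate national net undercount cited in the text

problem_clusters = 2
initial_contribution = 1_000_000      # "nearly one million", rounded up here (assumption)
downweighted_contribution = 150_000   # contribution after downweighting (Hogan, 1991b)

# Share of all sampled block clusters represented by the two problem clusters.
cluster_share = problem_clusters / TOTAL_BLOCK_CLUSTERS

# Share of the estimated net undercount attributable to them, before and after downweighting.
initial_share = initial_contribution / NET_UNDERCOUNT
downweighted_share = downweighted_contribution / NET_UNDERCOUNT

print(f"share of block clusters: {cluster_share:.3%}")
print(f"share of net undercount before downweighting: {initial_share:.0%}")
print(f"share of net undercount after downweighting: {downweighted_share:.0%}")
```

Fewer than one in two thousand block clusters thus initially accounted for roughly one-fifth of the estimated net undercount.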

5. Unreliable Interviews 

Another problem which the PES must contend with is unreliable interviews. 
Interviews can be unreliable for many reasons, including interviewer errors, 




language barriers, lack of information on the part of respondents (some of whom 
are children and some of whom are neighbors, landlords, or other non-members 
of the household), and lack of cooperation on the part of respondents (some of 
whom are criminals, illegal immigrants, psychotics, or practical jokers). The 
serious implications of this problem for measurement of undercount through a 
coverage survey are demonstrated in the P-9a report. The Evaluation Follow-Up 
project conducted new interviews for a sample of PES E-sample records. The 
new interview information was given to matching teams with instructions to 
change match status only if new, relevant, and reliable information was present in 
the new interview. The result was that 13% of the records changed match status. 
In fact, a majority of these changes (7% of the records examined) involved 
changes from “erroneous enumeration” to “correct enumeration” or vice versa; 
the remainder (6% of the records examined) involved changes from one of these 
categories to “unresolved” or vice versa (West, 1991d; Ericksen et al., p.512).
Although Ericksen et al. stress the fact that the changes had a general tendency to
cancel each other out and that they had fairly little effect on the net undercount 
estimates, the more pertinent implication for the present analysis is that a very 
substantial proportion of cases from the Post-Enumeration Survey had very 
uncertain match status. Whether these changes in match status are attributable to 
unreliable information in the initial interviews or merely to a tendency for match 
status to change each time a different team of matchers examines a difficult case, 
the fact remains that we are trying to measure a subtle phenomenon with a very 
crude instrument. Based on the findings in the P-9a report, weighted to reflect 
the national population, over 2 million persons would have changed from 
“correctly enumerated” to other classifications, and over 1.6 million persons 
would have changed from “erroneously enumerated” to other classifications 
(West, 1991d). In the context of a net national undercount of only about 5 
million people, the magnitude of these reclassifications suggests very serious 
problems resulting from unreliable interview data. 

6. Unresolvable Cases 

After all of the follow-up, review, and rematching involved in the 1990 PES,
there were still 5,359 E-sample cases and 7,156 P-sample cases which remained 
unresolved and had to be imputed. This represents approximately 1.6% of the 
total combined P-sample and E-sample cases. On the one hand, the fact that the 
number was not larger is a testimony to the persistence and ingenuity of the PES 
staff. On the other hand, it must be noted that the percentage of unresolved cases 
was very close to the total percentage of the population that is believed to be 




undercounted. Thus, unresolved cases are not a small problem, but rather a 
problem that can have a critical impact on the undercount estimate. As Dr. 
Breiman notes, the undercount estimate would nearly double if all of the 
unresolved P-sample cases were assumed to be unmatched and all of the E-sample 
cases were assumed to be correctly enumerated, but the opposite assumptions 
would suggest a census overcount of one million persons (Parmer, 1991; 
Breiman, 1994, p.468). 

The match status of the unresolved cases was imputed through a complex 
regression model that involved estimating coefficients for dozens of variables 
(Belin et al., 1993). However, regardless of the complexity of the methodology 
or the carefulness of its assumptions, it must be recognized that the cases we are 
talking about here are all ones that could not be classified as matches or non- 
matches even after careful and repeated review of all of the information available 
about them. Very little is known about what proportion of unresolvable survey 
responses really do match with one another. An imputation process may be able 
to produce a “reasonable” allocation of records to matched and unmatched status, 
but it cannot classify them definitively. A “reasonable” allocation would be 
sufficient if the proportion of unresolved cases were very small relative to the 
rate of undercount, but it is not sufficient when the proportion of unresolved 
cases is nearly as great as the net rate of undercount. The large number of 
unresolvable cases is by itself a fatal flaw in the undercount analysis. 


Impact of Identified Sources of Error on the Undercount Estimate 

We have seen that the undercount measurements are subject to several serious 
sources of error. In order to determine whether these errors can serve as a 
solution to the paradox identified at the beginning of this paper, it is necessary to 
see whether their combined effect would elevate the undercount estimates enough 
to offset the tendency for the coverage survey to miss many of the same people 
that are missed in the Census. 

Several attempts have been made to quantify the net effect of identified 
measurement errors on the 1990 estimates of undercount. The analysis in the 
Census Bureau's P-16 report indicates that corrections for measurement errors in 
the 1990 PES would have decreased the undercount estimate from 2.1% to 1.4% 
(Mulry, 1991). A later analysis by the same author incorporated additional 
corrections related to a major computer processing error discovered by the 
Census Bureau in late 1991, the rematching of records in some suspect blocks, 
and the inclusion of very late Census data that had not been available when the 
initial PES estimates were developed. This analysis suggested that corrections for 
identified measurement errors would have reduced the undercount estimate from 
2.1% to 0.9% (Mulry, 1992). An analysis by Dr. Leo Breiman, which built 
upon the Census Bureau analyses cited above, incorporated additional sources of 
error to arrive at an adjusted undercount estimate of only 0.6% (Breiman, 1994, 
p.475). This does not mean that the “true undercount” was only 0.6%, but 
merely that this is the amount of apparent undercount identified by the 1990 
coverage survey which remains after making rough adjustments for the errors 
that have been identified and documented. Dr. Breiman’s estimates of the impact 
of each error source, based on data from the Census Bureau evaluations, are 
shown in Figure 1. Dr. Breiman concludes that about 70% of the net undercount 
adjustment that had been proposed for the 1990 Census count (3,706,000 out of 
5,275,000 persons) actually reflects identified measurement errors rather than 
actual undercount. 


Figure 1 

Impact of Identified Sources of Error 
on the 1990 Undercount Adjustments 

Error Source                            Impact on Undercount Estimate (footnote 4) 
                                        (i.e. number of persons 
                                        erroneously added to undercount) 

P-sample rematching                       553,000 
Census-day address errors                 274,000 
Fabrications                               50,000 
E-sample rematching                       624,000 
E-sample re-interview                    -473,000 
P-sample re-interview                     537,000 
Ratio estimator bias                      290,000 
Computer coding error                   1,018,000 
Late-late Census data                     183,000 
New out-of-scopes in re-match             164,000 
New out-of-scopes in re-interview         358,000 
Re-interview of non-interviews            128,000 

TOTAL                                   3,706,000 

Estimate of identified net undercount prior to 
correction for identified errors:       5,275,000 

Estimate of identified net undercount after 
correction for identified errors:       1,569,000 

Note: The first seven of these error sources are considered in the P-16 report (Mulry, 1991), and 
the first nine error sources are considered in the subsequent Census Bureau report by the same 
author (Mulry, 1992). 

4 With the exception of the count of Census day address errors, these figures are taken from Table 15 of Breiman 
(1994). That table indicated 811,000 Census day address errors, based on the P-4 and P-16 reports. As 
explained in Footnote 3 above, that figure is corrected here to 274,000. This correction is also reflected in Dr. 
Breiman’s finding that correction of identified errors would lower the undercount estimate to 0.6%. Excluding 
that correction, Dr. Breiman’s adjusted undercount estimate was only 0.4%. 

It should be noted that, like the original PES estimates of undercount, these estimates of PES error are subject 
to both sampling error and non-sampling error. Moreover, it is likely that they fail to identify all of the 
problems of the PES. Nevertheless, these estimates are more than adequate for the present purpose of 
demonstrating that the 1990 coverage survey involved a very large amount of measurement error and that its 
identified errors are sufficient to explain the paradox laid out at the beginning of this paper. However, they 
should not be interpreted as producing a definitive estimate of the amount of “true” undercount that was 
identified by the 1990 PES. 
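
The totals reported in Figure 1 can be checked with a few lines of arithmetic. The following Python sketch simply re-adds the twelve error-source figures listed in Figure 1 and compares the sum with the 5,275,000 starting estimate. The variable names are illustrative only, and nothing here goes beyond the figures already reported.

```python
# Error-source impacts as listed in Figure 1
# (number of persons erroneously added to the undercount).
error_impacts = {
    "P-sample rematching": 553_000,
    "Census-day address errors": 274_000,
    "Fabrications": 50_000,
    "E-sample rematching": 624_000,
    "E-sample re-interview": -473_000,
    "P-sample re-interview": 537_000,
    "Ratio estimator bias": 290_000,
    "Computer coding error": 1_018_000,
    "Late-late Census data": 183_000,
    "New out-of-scopes in re-match": 164_000,
    "New out-of-scopes in re-interview": 358_000,
    "Re-interview of non-interviews": 128_000,
}

# Net undercount identified by the PES before correcting for these errors.
ORIGINAL_NET_UNDERCOUNT = 5_275_000

total_error = sum(error_impacts.values())
remaining = ORIGINAL_NET_UNDERCOUNT - total_error
error_share = total_error / ORIGINAL_NET_UNDERCOUNT

print(f"Total identified error:      {total_error:,}")
print(f"Remaining net undercount:    {remaining:,}")
print(f"Share attributable to error: {error_share:.1%}")
```

The sum reproduces the 3,706,000 total and the 1,569,000 remainder, and the share works out to roughly 70%, which is the proportion cited in the text.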

Despite their differences, these three studies all point clearly to the same 
conclusion: There are enough measurement errors which inflate the undercount 
estimate to roughly offset the large number of people who appear to be missed by 
both surveys. This provides the solution for the paradox identified at the 
beginning of this paper. 

Thus, it appears that the 1990 coverage survey missed a very substantial number 
of people who were missed by the Census, but that it also identified a large 
number of people as missed by the Census who actually had been counted. 
Moreover, there is a large amount of additional error — far greater in magnitude 
than the level of undercount — which is less visible at the broadest level of 
aggregation because the errors in one direction are offset by errors in the other 
direction. Thus, while the 1990 coverage survey suggests an overall level of 
undercount similar to that indicated by demographic analysis, it cannot be relied 
upon to shed light on patterns of undercount for different demographic 
components of the population or for different geographic areas. The differential 
undercounts indicated by the coverage survey largely reflect differences in the 
incidence and direction of survey matching errors and other methodological 
problems rather than differences in the incidence of Census undercount. As 
noted in the preceding paper, this does not reflect deficiencies in the skill and 
effort applied to the task by the Census Bureau, but rather it reflects the 
impossibility of adequately measuring undercount in this manner. 




REFERENCES 

Belin, T.R., G.J. Diffendal, S. Mack, D.B. Rubin, J.L. Schafer, and A.M. Zaslavsky (1993). 
Hierarchical Logistic Regression Models for Imputation of Unresolved Enumeration Status 
in Undercount Estimation. Journal of the American Statistical Association, 88:1149-1166. 

Biemer, P.P., and Stokes, S.L. (1989). The Optimal Design of Quality Control Samples to 
Detect Interviewer Cheating. Journal of Official Statistics, 5, 23-39. 

Breiman, Leo (1994). The 1991 Census Adjustment: Undercount or Bad Data. Statistical 
Science, 9(4):458-537. 

Hogan, Howard (1991a). The 1990 Post-Enumeration Survey: An Overview. Unpublished 
paper, U.S. Bureau of the Census. 

Hogan, Howard (1991b). Downweighting Outlier Small Blocks. STSD 1990 Decennial Census 
Memorandum Series #V-109, addressed to John Thompson, Chief, Statistical Support 
Division, Bureau of the Census, Washington D.C., June 18, 1991. 

Hogan, Howard (1993). The 1990 Post-Enumeration Survey: Operations and Results. Journal of 
the American Statistical Association, 88(423): 1047-1060. 

Mulry, Mary H. (1991). P-16 Report: Total Error in PES Estimates by Evaluation Post Strata. 
U.S. Bureau of the Census, 1990 Post-Enumeration Survey Evaluation Project, Series #R-6. 

Mulry, Mary H. (1992). Total Error of Post Census Review Estimates of Population. Decennial 
Statistical Studies Division, U.S. Bureau of the Census, Washington D.C., July 7, 1992. 

Parmer, Randall (1991). P-11 Report: Balancing Error Evaluation. U.S. Bureau of the Census, 
1990 Post-Enumeration Survey Evaluation Project, Series #M-2. 

Ringwelski, Michael (1991). P-8 Report: Matching Error — Estimates of Clerical Error from 
Quality Assurance Results. U.S. Bureau of the Census, 1990 Post-Enumeration Survey 
Evaluation Project, Series #I-2. 

Stokes, S.L. and Jones P.M. (1989). Evaluation of the Interviewer Quality Control Procedure 
for the Post-Enumeration Survey, American Statistical Association 1989 Proceedings of 
the Section of Survey Research Methods, 696-698. 

Tremblay, Antoinette (1991). P-5 Report: Analysis of PES P-Sample Fabrications from PES 
Quality Control Data. U.S. Bureau of the Census, 1990 Post-Enumeration Survey 
Evaluation Project, Series #E-4. 

West, Kirsten K. (1991a). P-4 Report: Quality of Reported Census Day Address. U.S. Bureau 
of the Census, 1990 Post-Enumeration Survey Evaluation Project, Series #D-2. 

West, Kirsten K. (1991b). P-5a Report: Analysis of P-Sample Fabrication from Evaluation 
Follow-Up Data. U.S. Bureau of the Census, 1990 Post-Enumeration Survey Evaluation 
Project, Series #F-1. 

West, Kirsten K. (1991c). P-6 Report: Fabrication in the P-Sample: Interviewer Effect. U.S. 
Bureau of the Census, 1990 Post-Enumeration Survey Evaluation Project, Series #G-2. 

West, Kirsten K. (1991d). P-9a Report: Accurate Measurement of Census Erroneous 
Enumerations. U.S. Bureau of the Census, 1990 Post-Enumeration Survey Evaluation 
Project, Series #K-2. 

Wolter, Kirk M. (1987). Technical Feasibility of Adjustment. Memorandum to the Undercount 
Steering Committee, Bureau of the Census, Washington D.C. 




Mr. DARGA. It’s no surprise that the census doesn’t count every- 
body. The census has a hard time counting people who don’t trust 
the Government or who don’t want the Government to know where 
they are. The census doesn’t do a very good job counting homeless 
people, either, and there are many other factors that make a com- 
plete count very difficult. So, the Census Bureau tries to fix the 
problem by counting people in some neighborhoods a second time 
and then comparing the results person-by-person with the census. 

In 1990, this method seemed to find just about all the people in 
this sample of neighborhoods who were missed by the census. This 
sounds great until you realize what’s really happening. The 1990 
Post Enumeration Survey didn’t really find all the people who were 
missed. People who didn’t want to be counted the first time, didn’t 
want to be counted the second time, either. And, the Post Enu- 
meration Survey didn’t even try to count homeless people. But it 
did find quite a few people who looked like they were missed by 
the census when they really weren’t. In fact, most of the people 
that the Post Enumeration Survey identified as missed by the cen- 
sus really weren’t missed by the census. That’s a surprising claim. 
How can you know that it’s true? There are at least two ways: a 
theoretical approach and an empirical approach. 

First, the theoretical approach. On pages 6 through 9 of my first 
paper, you will find a very simple and very basic statistical phe- 
nomenon that explains why serious problems are inevitable when 
you try to measure undercount with a coverage survey. These 
pages show that an effort to measure a small component of the 
population, such as people missed by the census, is very sensitive 
even to extremely small sources of measurement error. The cov- 
erage survey has to contend with a lot of very large sources of 
measurement error. So it shouldn’t be surprising that the coverage 
survey identifies a lot of people as missed by the census when they 
really weren’t. It would be a lot more surprising — unbelievable, in 
fact — if it didn’t. 
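
The sensitivity described here can be illustrated with made-up numbers. The Python sketch below assumes a hypothetical population in which 2 percent of people are truly missed, assumes the survey finds all of them, and then asks what happens if a small fraction of correctly counted people are wrongly classified as missed. None of these rates comes from the Census Bureau evaluations, and the function name is invented for illustration.

```python
def spurious_share(true_miss_rate: float, false_miss_rate: float) -> float:
    """Fraction of people flagged as missed who were in fact counted.

    true_miss_rate:  share of the population really missed by the census
    false_miss_rate: share of correctly counted people wrongly flagged as missed
    Both inputs are hypothetical, not published estimates.
    """
    truly_missed = true_miss_rate
    falsely_flagged = (1.0 - true_miss_rate) * false_miss_rate
    return falsely_flagged / (truly_missed + falsely_flagged)


# Hold the true miss rate at a hypothetical 2 percent and vary the error rate.
for rate in (0.005, 0.01, 0.02):
    share = spurious_share(0.02, rate)
    print(f"false-miss rate {rate:.1%}: {share:.0%} of apparent misses are spurious")
```

Under these assumptions, an error rate of only 1 percent among correctly counted people makes roughly a third of the apparent misses spurious, because the error is applied to the 98 percent who were counted while the true signal comes from only 2 percent.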

You can also see the problems with the undercount adjustments 
by taking an empirical approach. The Census Bureau evaluated the 
1990 Post Enumeration Survey quite extensively, and it did a very 
impressive job of documenting its shortcomings. I also want to ac- 
knowledge the valuable work of Leo Breiman, of the University of 
California, at Berkeley, in evaluating the Census Bureau’s evalua- 
tions. 

My second paper discusses six very serious sources of error that 
were documented by the Census Bureau: survey matching error, 
fabrication of interviews, ambiguity or misreporting of usual resi- 
dence, geo-coding errors, unreliable interviews, and unresolvable 
cases. And the Census Bureau didn’t document just a little bit of 
error. One thing that the theoretical approach and the empirical 
approach have in common, is that they both demonstrate very large 
amounts of error in the Census Bureau’s adjustments for 
undercount. The adjustments based on the Post Enumeration Sur- 
vey reflect errors in measuring undercount even more than they re- 
flect undercount itself. 

Now, you might think that since the estimated net undercount 
is less than 2 percent of the population, even a bad adjustment for 
it wouldn’t cause big problems. Before you make that mistake, it 
is important to consider the examples on pages 11 to 15 of my first 
paper. These pages demonstrate that the 1990 Post Enumeration 
Survey identified some undercount differentials of 10 percentage 
points, 20 percentage points, and more that turned out to be totally 
spurious. Now I want to be clear about what I mean by a difference 
of 20 percentage points. These examples don’t just involve inflating 
one group by 1 percent and another group by 1.2 percent; that 
would be a difference of 20 percent. If the difference should really 
be zero percent, that could be a problem for some purposes. But 
that is not what I mean by a difference of 20 percentage points. 
These examples in my paper involve inflating a population group 
by, say, 8 percent and another group by 28 percent, when neither 
group has been undercounted more than the other. This is not a 
problem that only demographers would be concerned about. This 
problem is big enough to affect every user of census data. It’s clear 
that the Census Bureau’s method does not provide suitable meas- 
urements of undercount. In an effort to solve a net undercount of 
less than 2 percent, the reliability of the census would be utterly 
destroyed. This is a strong statement, but that does not mean that 
it is an overstatement. It would be very difficult to overstate the 
implications of having errors of this magnitude integrated with the 
census counts. 
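
The distinction between a difference of 20 percent and a difference of 20 percentage points can be made concrete with the numbers used in the testimony. The short Python sketch below is only an illustration; the two function names are invented here.

```python
def percent_difference(a: float, b: float) -> float:
    """Relative difference of b over a, expressed in percent."""
    return (b - a) / a * 100.0


def percentage_point_difference(a: float, b: float) -> float:
    """Absolute gap between two rates that are already expressed in percent."""
    return b - a


# Adjusting one group by 1 percent and another by 1.2 percent:
print(round(percent_difference(1.0, 1.2), 1), "percent difference")
print(round(percentage_point_difference(1.0, 1.2), 1), "percentage points")

# Adjusting one group by 8 percent and another by 28 percent:
print(round(percentage_point_difference(8.0, 28.0), 1), "percentage points")
```

The first pair differs by 20 percent but by only 0.2 percentage points. The second pair, which is the kind of example at issue here, differs by 20 full percentage points.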

Thank you, again, for the opportunity to testify this afternoon. 

[The prepared statement of Mr. Darga follows:] 





Summary of Testimony on Census Undercount 
for the House Subcommittee on the Census 

Kenneth J. Darga, Senior Demographer 
Michigan Department of Management and Budget 
May 5, 1998 


I would like to thank Chairman Miller and all the members of the Subcommittee on the Census for 
inviting me to speak with you today about Census undercount adjustment. At this time I would 
like to submit two papers for the record which I will summarize briefly. 


The Fallacy of Undercount Adjustment 

It’s no surprise that the Census doesn’t count everybody. The Census has a hard time counting 
people who don’t trust the government or don't want the government to know where they are. 
The Census doesn't do a very good job counting homeless people either, and there are many other 
factors that make a complete count very difficult. 

So the Census Bureau tries to fix the problem by counting people in some neighborhoods a 
second time and comparing the results person-by-person with the Census. In 1990, this method 
seemed to find just about all the people in this sample of neighborhoods who were missed by the 
Census. 

This sounds great until you realize what’s really happening. The 1990 Post-Enumeration Survey 
didn’t really find all the people that were missed. People who didn't want to be counted the first 
time didn’t want to be counted the second time either, and the Post-Enumeration Survey didn’t 
even try to count homeless people. But it did find quite a few people who looked like they were 
missed by the Census when they really weren’t. In fact, most of the people that the Post- 
Enumeration Survey identified as missed by the Census really weren’t missed by the Census. 

That’s a surprising claim. How can you know that it’s true? 

There are at least two ways: a theoretical approach, and an empirical approach. 

Theoretical Verification 

First, a theoretical approach. On pages 6 through 9 of my first paper,* you will find a very 
simple and very basic statistical phenomenon that explains why serious problems are inevitable 
when you try to measure undercount with a coverage survey. These pages show that an effort to 
measure a small component of the population, such as people missed by the Census, is very 
sensitive even to extremely small sources of measurement error, and that the coverage survey has 
to contend with a lot of very large sources of measurement error. 

* Kenneth J. Darga, “Straining Out Gnats and Swallowing Camels: The Perils of Adjusting for Census Undercount.” Submitted 
to the Subcommittee on the Census, House Committee on Government Reform and Oversight, May 5, 1998. 





So it shouldn’t be surprising that the coverage survey identifies a lot of people as missed by the 
Census when they really weren’t. It would be a lot more surprising (unbelievable, in fact) if it 
didn’t. 

Empirical Verification 

You can also see the problems with the undercount adjustments by taking an empirical approach. 
The Census Bureau evaluated the 1990 PES quite extensively, and it did a very impressive job of 
documenting its shortcomings. I also want to acknowledge the important work of Dr. Leo 
Breiman of the University of California at Berkeley in evaluating the Census Bureau’s evaluations. 

My second paper** discusses six very serious sources of error that were documented by the 
Census Bureau: 

- survey matching error 

- fabrication of interviews 

- ambiguity or mis-reporting of usual residence 

- geocoding errors 

- unreliable interviews 

- unresolvable cases. 

And the Census Bureau didn’t document just a little bit of error. One thing that the theoretical 
approach and the empirical approach have in common is that they both demonstrate very large 
amounts of error in the Census Bureau’s adjustments for undercount. The adjustments based on 
the Post-Enumeration Survey reflect errors in measuring undercount even more than they reflect 
undercount itself. 


Impact on Census Data 

Now you might think that, since the estimated net undercount is less than two percent of the 
population, even a bad adjustment for it wouldn’t cause big problems. Before you make that 
mistake, it is important to consider the examples on pages 11-15 of my first paper.* These pages 
demonstrate that the 1990 PES identified some undercount differentials of 10 percentage points, 
20 percentage points, and more that turned out to be totally spurious. 

I want to be clear about what I mean by a difference of 20 percentage points. These examples 
don’t just involve inflating one group by 1% and another by 1.2%. That would be a difference of 
20 percent. If the difference should really be 0 percent, that could be a problem for some 
purposes. But that is not what I mean by a difference of 20 percentage points. 


* Kenneth J. Darga, “Straining Out Gnats and Swallowing Camels: The Perils of Adjusting for Census Undercount.” Submitted 
to the Subcommittee on the Census, House Committee on Government Reform and Oversight, May 5, 1998. 

** Kenneth J. Darga, “Quantifying Measurement Error and Bias in the 1990 Undercount Estimates.” Submitted to the 
Subcommittee on the Census, House Committee on Government Reform and Oversight, May 5, 1998. 





These examples involve inflating one population group by 8% and another group by 28% when 
neither group has been undercounted more than the other. This is not a problem that only 
demographers would be concerned about: This problem is big enough to affect every user of 
Census data. 

In an effort to solve an undercount of less than 2%, the reliability of the Census would be utterly 
destroyed. This is a strong statement, but that does not mean it is an overstatement. It would be 
very difficult to overstate the implications of having errors of this magnitude integrated with the 
Census counts. 

Thank you for the opportunity to testify this afternoon. I would be happy to answer any 
questions you may have. 





Mr. Miller. Thank you. Dr. Coffey. 

Mr. Coffey. Yes, thank you. I’m afraid the only title I have right 
now is the den leader of my local Cub Scout den. [Laughter.] 

But until last year, and for the last 17-18 years, I was the senior 
mathematical statistician in the Statistical Policy shop in the Of- 
fice of Management and Budget, and in fact, I’ve been a “math 
stat” in the Federal Government for over 30 years when I retired. 

I’d like to thank you, Mr. Chairman, and members of the sub- 
committee for the opportunity to comment on these census issues, 
and also, particularly, to thank the subcommittee staff for the 
many documents they provided, especially the extraordinary papers 
that Kenneth Darga has just introduced. 

I want to talk — if I have enough time — I want to talk about two 
things; definitely, talk about the first one 

Mr. Miller. Dr. Coffey, could you bring the mic a little clos- 
er — 

Mr. Coffey. Certainly. 

Mr. Miller [continuing]. For the transcriber, thank you. 

Mr. Coffey. The first is a remarkable report generated by the 
senior Census Bureau staff and a panel of experts called the Report 
of the Committee on Adjustment of Postcensal Estimates, or CAPE, 
for short. This was an analysis of the adjustment methodology 
known as Dual System Estimation, DSE. It was undertaken after 
the adjustment decision was made in 1991. While I have some res- 
ervations about the ground rules of the study, I believe it was an 
excellent piece of work by some outstanding professionals. 

Conceptually, the Dual System Estimation approach looked at 
what they called four cells, characterized by different mixes of 
matching and non-matching, or missing, records. Three of the cells 
really dealt with records that existed. The fourth cell consisted of 
the hypothetical cases that were missed by both systems, both the 
actual enumeration and the followup sample, or Post Enumeration 
Survey. 
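
The four-cell idea can be sketched with the textbook capture-recapture form of dual system estimation, in which the unobserved fourth cell is inferred from the other three on the assumption that the two systems miss people independently of each other. The Python sketch below is that simplified textbook version, not the Census Bureau's production procedure, and the counts are hypothetical rather than 1990 PES data.

```python
def dual_system_estimate(matched: int, census_only: int, survey_only: int):
    """Simplified dual-system (capture-recapture) estimate.

    matched:      people found by both the census and the survey
    census_only:  people found by the census but not by the survey
    survey_only:  people found by the survey but not by the census
    Returns (estimated fourth cell, estimated total population).
    Assumes the two systems miss people independently of each other.
    """
    if matched <= 0:
        raise ValueError("need at least one matched case")
    # Under independence, the cell missed by both systems is estimated
    # from the cross-product of the two single-system cells.
    fourth_cell = census_only * survey_only / matched
    total = matched + census_only + survey_only + fourth_cell
    return fourth_cell, total


# Hypothetical counts for one sampled area (not data from the 1990 PES).
fourth, total = dual_system_estimate(matched=900, census_only=80, survey_only=20)
print(f"estimated missed by both systems: {fourth:.1f}")
print(f"estimated total population:       {total:.1f}")
```

Because the fourth cell is computed entirely from the other three, any error in classifying matches and non-matches feeds directly into it, and the independence assumption cannot be checked from the observed cells alone.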

Clearly, you can do a lot more with data than you can without 
it, and the committee did quite a lot with their analysis of those 
first three cells. As you heard earlier, it found some errors that ex- 
aggerated the original estimates of undercount by about 20 percent 
or so. Subsequently, it found that another 45 percent of what was 
left — now this is after the number had been deflated from 2.1 down 
to 1.6 percent — 45 percent of what was left was attributable to 
measurable bias. The report, itself, put it in even stronger terms. 
“Therefore, about 45 percent of the revised estimated undercount 
is actually measured bias and not measured undercount. In 7 of 
the 10 evaluations strata, 50 percent or more of the estimated 
undercount is bias.” This is from the Census Bureau, the CAPE re- 
port, page 15. 

That first bias was removable, and it was removed in the revised 
estimate. The Census Bureau’s expert panel urged them to attempt 
to remove the second, larger bias. But the Bureau determined that 
it could not be removed without risking even larger errors. 

At this stage of the evaluation, the expert panel and the commit- 
tee were asking the questions statisticians should always ask, “Are 
we measuring what we think we are measuring?” The answer produced 
considerable discomfort, and the fact that the bias was inextricably 
interwoven with the apparent undercount effects made 
matters worse. 

In theory, there can be offsetting unmeasurable bias, for exam- 
ple, what’s been called the correlation bias. This kind of thing in- 
volves assumptions that are not strictly satisfied and, particularly, 
the size of that fourth cell where there is no data — where you don’t 
have any data to infer from. But if you think about this situation 
you can begin to see that there’s a “Catch 22” here. 

For Dual System Estimation to work, the unobserved fourth cell 
must be small. If both the actual enumeration and the later sample 
miss a substantial proportion of the uncounted populations, then 
the DSE estimation process begins to unravel. The attributes of the 
measured portion, now small compared to the total undercount that 
you think might be there, can’t be attributed to this whole un- 
counted group without substantial risk of additional bias. On the 
other hand, if the fourth cell is small, then the offsetting bias is 
small, and one is left with a measured undercount about half the 
size implied by demographic analysis. 

We’ve seen numbers around the room here today — this one may 
not be correct, though it is footnoted with a correct footnote: demographic 
analysis had it, at one point, 1.8 percent, I believe, in 1990. 
The original, official estimate would have put it at 2.1. This was 
clearly wrong and was later corrected down to 1.6. The 1.6 is net 
of these bias adjustments, rather the bias adjustments have not 
been made. What this says is that a big chunk of that 1.6 was bias, 
about half, and you start to get into logical difficulties if you try 
to, in fact, deal with the potential offsetting bias, also 
unmeasurable. You end up with a number that’s down around half 
the size of the 1.8 given by demographic analysis — about 0.9. 
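
The chain of percentages in this part of the testimony can be followed step by step. The Python sketch below uses only the figures quoted by the witness (2.1, 1.6, 45 percent, and 1.8); the variable names are illustrative, and the calculation is a rough check rather than a reconstruction of the CAPE analysis.

```python
ORIGINAL_ESTIMATE = 2.1       # percent, original PES undercount estimate
REVISED_ESTIMATE = 1.6        # percent, after the first correction
MEASURED_BIAS_SHARE = 0.45    # share of the revised estimate attributed to bias
DEMOGRAPHIC_ANALYSIS = 1.8    # percent, demographic analysis estimate

# Size of the first correction relative to the original estimate.
first_correction = (ORIGINAL_ESTIMATE - REVISED_ESTIMATE) / ORIGINAL_ESTIMATE

# Undercount left after removing the measured bias from the revised estimate.
net_of_bias = REVISED_ESTIMATE * (1 - MEASURED_BIAS_SHARE)

# How that remainder compares with demographic analysis.
ratio = net_of_bias / DEMOGRAPHIC_ANALYSIS

print(f"first correction: {first_correction:.0%} of the original estimate")
print(f"net of measured bias: {net_of_bias:.2f} percent")
print(f"ratio to demographic analysis: {ratio:.2f}")
```

The remainder comes to about 0.9 percent, or roughly half of the demographic analysis figure, which is the factor-of-two discrepancy the witness describes.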

Demographic analysis isn’t perfect, by any means, but I don’t 
think very many people would be comfortable with the idea that 
demographic analysis missed the undercount estimate by a factor 
of two or more, which is where this logic leaves you. 

In the report, the Census Bureau assumes a moderately small 
correlation bias which did not fully offset the measured bias and, 
thus, was equivalent to a measured undercount of about 1.2 per- 
cent, a third contender in the undercount and measurement run- 
ning. The remainder of the analysis put the assumptions and facts 
under a microscope. It was a very complex chain of reasoning. Un- 
fortunately, many, many of the results turned out to be inconclu- 
sive and worse yet, in some cases, it produced results that were im- 
possible when they tried to test the consistency of the facts and the 
assumptions. One of the biggest headaches was negative values in 
the fourth cell, which drew a lot of attention from the expert panel. 

One other interesting thing happened late in the review process. 
A committee member suggested that they consider, quote, “a composite 
50-50 estimate which would be the simple average of the 
census count and the adjusted base.” 

After all the time they had spent on research and analysis, this 
simple “split-the-difference” idea didn’t sit too well with many of 
the committee members as you can imagine. On the other hand, 
quote “Analysis done by the committee members showed that hy- 
pothesis test results at the State level were much more favorable 
to the composite estimate than to the full adjustment, even without 
including correlation bias.” Actually, this result is neither trivial 
nor surprising, and I touch on it a little bit in my written state- 
ment. 

The bottom line: this extraordinary effort in what I believe was 
a politically neutral environment — the tough decisions had already 
been made — based on a massive amount of data and a large vol- 
ume of additional research left profound doubts about the DSE 
methodology and its future. 

There was a prescient comment from a member of the expert 
panel who cautioned that he would not be surprised to see additional 
research, after July 1992, turn up new results and new estimates 
of undercount. Now it is 8 years after the 1990 census, and researchers 
are still finding significant new problems. By mid-1992, 
about half of the DSE estimate was attributable to measured bias. 
The later research cited in Mr. Darga’s paper raised the figure to 
70 percent measurement error. I understand, and you heard ear- 
lier, that there are further papers that now put the split on the 
undercount estimate at 20 percent undercount and 80 percent 
error. I think anybody who is concerned with the accuracy of the 
census really needs to read and understand the mechanisms that 
are described in Ken Darga’s paper, and what kinds of con- 
sequences they produce. 

If I have time, I’d like to go into one additional item which con- 
cerns me, which is the interrelationship of this to the plan not to 
pursue the last 10 percent of the countable population during the 
actual enumeration. 

In my 32 years as a Government statistician, I’ve never found 
anyone willing to argue that truncating followup will improve qual- 
ity of data. Saving time or money is usually the issue, but not im- 
proving quality, and that’s the case here. The issue is not quality, 
but resources. What this is going to amount to, making this decision 
not to pursue that last 10 percent of the population with intensive 
followup operations such as have been used in prior censuses, 
is that it will expand the uncounted portion of the population 
by a factor of five or more, the factor depending on how much you 
believe previous estimates. 

That isn’t to say that some things won’t look better. I was just 
looking around the room here at some of these charts and thinking 
that some of these can now be retired to archives if we follow this 
plan, because you will not have these kinds of independent meas- 
ures that could be compared in this way under the 2000 plan. 

There will no longer be an independent demographic analysis es- 
timate of undercount that can be compared to prior censuses pro- 
ducing this kind of time series, because there won’t be an actual 
enumeration figure to compare with. The long time series of this 
single, most-trusted measure of undercount will be broken. On the 
brighter side, you won’t be able to answer or ask a lot of questions 
about the accuracy of the demographic analysis either, because a 
lot of the discrepancies that have allowed the assumptions of demo- 
graphic analysis to be tested and refined over many decades will 
no longer be visible. There will be so much sampling error, imputa- 
tion error, and bias to contend with, you won’t be able to see those 
things anymore. 





DSE will probably look better than it did in 1990; you’d certainly 
hope so. A few million people attributable to bias doesn’t look so 
bad against a backdrop of 20-odd million uncounted people. But the 
bias will still be there. Adding a large chunk of more predictable, 
uncounted cases will make DSE look better, but it won’t reduce the 
kernel of tough, uncounted cases that really brought it to disaster 
in the 1990 census. 

Since major portions of the bias, in fact, arose from DSE oper- 
ations and procedures, some of those will scale right up with the 
larger version of DSE and will look like the artificially inflated 
total of the uncounted. On top of all this, the strategy for truncat- 
ing followup will add additional sampling error, imputation error, 
in millions of cases, where full followup would have produced accu- 
rate data. 

Some of my colleagues at OMB are going to have at me on this, 
but let me tell you, if Congress can’t find resources to intensively 
followup every citizen who can be convinced to participate in the 
census, it deserves the inaccurate census it will get. 

I thank the committee for the opportunity to express these views, 
and will be pleased to respond to questions. 

[The report referred to follows:] 




ASSESSMENT OF ACCURACY OF ADJUSTED VERSUS UNADJUSTED 1990 CENSUS 
BASE FOR USE IN INTERCENSAL ESTIMATES 


REPORT OF THE COMMITTEE ON ADJUSTMENT OF POSTCENSAL ESTIMATES
BUREAU OF THE CENSUS 
DEPARTMENT OF COMMERCE 

ORIGINAL AUGUST 7, 1992

RECOMMENDATION 


The Committee on Adjustment of Postcensal Estimates (with an acronym of CAPE, but
referred to in this report as the Committee) investigating potential census adjustment for
intercensal population estimates concluded that on average, an adjustment to the 1990 base
at the national and state levels for use in intercensal estimates would lead to an
improvement in the accuracy of the intercensal estimates. (Attachment 1 contains a list
of the members of the Committee.) This conclusion was based on a set of extensive
research and analyses as well as input from outside consultants. This outside technical
advice included a Panel of Experts whose work culminated in a day-long meeting with Census
Bureau staff. (Attachment 1 contains a list of the Panel of Experts.) Under the auspices
of the Office of Management and Budget (OMB), there also was consultation with other
Federal agencies, which are prime users of intercensal estimates.

In coming to its conclusion, the Committee did not vote. Instead, there was an attempt to
reach consensus. The conclusion of the Committee was not unanimous, but the large
majority of the Committee agreed with the finding. Since there was no vote, this report
does not contain a specific listing of minority opinions. Rather, a series of concerns is
listed. There was general consensus on several key points.


1. This decision was separate and distinct from the July 1991 decision about
whether to adjust the 1990 census for all uses. Making a decision about whether to
adjust the full census is quite different from deciding whether to adjust the base
that is used in mathematical algorithms to produce estimates of population at
several points in the decade between censuses (intercensal estimates).


2. The majority of the Committee concluded that on average, an adjusted state base
would be more accurate than an unadjusted state base for use in intercensal
estimates, but the Committee recognized there is not necessarily improvement for
each and every state base. In fact, the Committee was concerned about a few
specific states where the evidence was inconsistent as to whether adjustment was
making an improvement. Even so, the Committee felt that overall there was
improvement at the state level.

3. States are an important political entity and the first tier in most funding
programs. Therefore, the Committee felt that every state or none of the states
should be adjusted. Even though some states are smaller than several large cities,
the Committee did not recommend adjusting selected cities or counties.

4. For smaller areas (generally, areas of less than 100,000 population), some of the
Committee judged that the use of an unadjusted base for the estimates was better
than the use of an adjusted base. Other Committee members concluded there was no
way to determine whether an adjusted or unadjusted base was more accurate. In the
absence of data showing improvement by adjustment, the Committee concluded that the
relative distribution of population by substate areas within each state was more




accurate using census counts than the comparable relative distribution using
adjusted counts.

5. The Committee was quite concerned about adjusting some, but not all, substate
areas, especially since there was no way to determine the cutoff of which areas to
adjust and there had been no research on the effect of adjustment for a partial set
of substate areas.

The Committee's technical assessment was based on a massive amount of data. While there
was a re-examination of the information already collected in conjunction with the
evaluation of the Post-Enumeration Survey (PES), the Committee relied mostly on a large
volume of additional research conducted since July 1991. In performing this additional
research the Census Bureau had more time so it could take full advantage of what it had
learned from its analysis to date of the 1990 census and the PES. The Census Bureau also
had fewer constraints to use prespecified procedures compared to the process in
conjunction with the July 1991 decision whether to adjust the 1990 census, for which a
court order required prespecified procedures. This additional research turned out to be
extremely useful, not only for this decision, but for future surveys of all kinds,
including those designed for potential adjustment. The Committee wants to acknowledge
specifically the massive effort that the professional statistical staff at the Census
Bureau put into this research. It was research of such quality that all those involved
should be rightly proud. The quality and usefulness of the research also were noted by
the set of outside experts that helped review Census Bureau research.

A full description of this research is beyond the scope of this report, but a summary is
provided. There are, however, extensive minutes of the Committee meetings, which contain,
as attachments, the major results of the additional research. The Committee would like to
commend David Whitford and Michael Batutis for preparing these excellent minutes.

In addition to providing useful information, this additional research detected some errors
and made some refinements to the levels of estimated undercount originally reported in the
spring of 1991. These changes are summarized in the following table and described more
fully later in the report.



                                      Estimated Undercount

                               June 1991                July 1992
Population                 Undercount  Sampling     Undercount  Sampling
Group                      Estimate    Error        Estimate    Error

U.S. Total                   2.08%       .18%         1.58%       .19%
Black                        4.82        .29          4.43        .51
Asian or Pacific Islander    3.08        .47          2.33       1.35
American Indian,
  Eskimo, or Aleut           4.77       1.04          4.52       1.22
Hispanic                     5.24        .42          4.96        .73

This report is a summary of the process that led to the Committee's recommendation.
Though the report concentrates on activities that took place late in the decision
process, the report also covers several topics that were discussed throughout the
year of deliberations by the Committee. Some readers of this report may desire
further background on the issue of undercount in a census and the efforts of the
Census Bureau to measure and potentially correct (adjust) for any such undercount.
There are numerous documents that could be read for background. One good summary
document is the notice in the Federal Register concerning the decision of the
Secretary of Commerce about whether to adjust the 1990 census (Reference: Federal
Register, Volume 56, #140, Part III, pages 33582-33692). The remainder of this
report is divided into several sections.


BACKGROUND -    This section contains a description of coverage in the decennial
UNDERCOUNT      census as well as the methods the Census Bureau uses to measure
                coverage.

BACKGROUND -    This section contains a description of why the Census Bureau
ESTIMATES       undertook the task of examining whether to adjust intercensal
                estimates as well as a very brief description of the estimates
                program and its use.

RESEARCH        This section summarizes the additional research done since July
                1991. This research was the major foundation for the Committee's
                assessment.

DECISION        This section briefly describes the decision process of the
                Committee as well as the Executive Staff. These final discussions
                as well as the year long deliberations of the Committee will be
                key pieces of input to the Director's decision.

FUTURE          This section contains a few general findings concerning the
                process of measuring undercount in the future.





























BACKGROUND ON UNDERCOUNT 

The issue facing the Committee was whether potential error in the PES and 
adjustment technology was at a sufficiently low level to recommend the
inclusion of results from the PES into intercensal estimates. The decennial 
census is also subject to error, and the PES tries to measure the net coverage 
error in the census. 

This section describes the operations of the 1990 PES to measure census 
coverage error and how these PES results might have been used for a potential 
adjustment of the 1990 census. This section is provided solely for 
background, so the section can be skipped for those already familiar with 
coverage error in a census as well as the Census Bureau's methods to measure 
coverage error by the PES and Demographic Analysis. 

Since the very first census, there have been problems in accurately counting 
every person living in the United States. The resulting undercount, or 
percentage of the population that is not counted by the census, is not a new 
phenomenon. Beginning with the 1940 census, each decennial census has 
included an evaluation program to attempt to measure the extent of undercount,
or what is often called coverage error. These evaluations showed a steady
improvement in net census coverage over four decades, from an estimated
undercount of more than 5 percent for the total population in 1940 to an
estimated undercount in 1980 of just over 1 percent. They also have shown
larger undercount rates for the Black population than the non-Black population 
and a differential that has stayed about 3-4 percentage points over the 
period. A difference in estimated undercount for one population subgroup 
(like Blacks) and another population subgroup (like non-Blacks) is called the 
differential undercount. 

Because of concern about this differential undercount, it was suggested that 
if the Census Bureau can estimate the number of people missed in a census, why 
not simply correct the census to account for missed persons and thereby make 
the census more accurate. This, in simple terms, is what is called
"adjustment." But estimating the census undercount with acceptably small
error and, in turn, using that knowledge to improve the census counts for all
levels of geography are two highly complex and difficult tasks.

The Census Bureau had two major programs to measure coverage in the 1990 
census. The first was the PES, which was a sample survey taken after the 
census. Approximately 165,000 housing units in a sample of 5,290 census
blocks or block clusters were interviewed. Block clusters are combinations of
small blocks. For the rest of this report, block will be used to mean a block
or a block cluster. Persons enumerated during the PES were also referred to
as the P-sample. After persons in the housing units in the selected sample
blocks were interviewed, their responses were matched to census records in the
same set of blocks to determine whether they were counted in the census. This
process measured erroneous omissions in the census.

The Census Bureau also measured erroneous inclusions in the census by
determining whether any of the persons in the PES sample blocks who were
enumerated in the census should not have been counted or should not have been
counted at that particular location. An erroneous census enumeration, for
example, could have included a child born after April 1, 1990, a person who
died before April 1, or a college student away from home who was enumerated at
his or her parents' address, instead of being correctly enumerated at the
college. Persons in this sample constitute the E-sample.

The data on erroneous inclusions and erroneous omissions were used to produce
an estimate of the net undercount or net overcount of the population in the
census. This was a very complex process that combined elements of survey
design, interviewing, matching, imputation, mathematical modeling and
professional judgment.

Second, the Census Bureau used a system called Demographic Analysis (DA) to
also measure census coverage. Basically, in DA, an independent estimate of
the total population is produced by combining various sources of
administrative data. This process included using historical data on births,
deaths, and legal immigration; estimates of emigration and undocumented
immigration; and Medicare data.

Demographic analysis estimates were used to evaluate the reasonableness of the
PES estimates. Only the PES provided estimates of undercount and overcount at
a level of detail suitable for use in potential adjustment. For example,
demographic analysis estimates were produced only at the national level and
for the Black and non-Black populations; the PES process was designed to
measure coverage error for more population subgroups (Whites, Blacks,
Hispanics, Asians and Pacific Islanders, and American Indians) by detailed
levels of geography. Therefore, only the PES data could permit an adjustment.

Each of these programs will be summarized below. For a more detailed
discussion of PES see Howard Hogan, "The 1990 Post-Enumeration Survey: An
Overview," a paper presented at the American Statistical Association in August
1990; for a more detailed discussion of Demographic Analysis see J. Gregory
Robinson, "Plans for Estimating Coverage of the 1990 United States Census:
Demographic Analysis," a paper presented to the Southern Demographic
Association in October 1989.


POST-ENUMERATION SURVEY (PES) 


Sample Design

The PES sample was selected in stages. First a random sample of blocks was
drawn. Blocks are small polygons of land surrounded by visible features.
Most are like the four-sided blocks in a city. Within the selected set of
sample blocks, all housing units were listed.

To select the sample of blocks, all blocks in the United States were assigned 
to one of 101 groups called strata. The strata were defined by geography, 
city size, racial composition, and percent of housing units that were renter 
occupied as opposed to owned. A representative sample of blocks was selected 
from each of the sampling strata. A separate sampling stratum was defined for 
American Indian Reservations. 





Persons living in institutions were excluded from the PES, as were military
personnel living in barracks, people living in remote rural Alaska,
persons in emergency shelters, and persons who had no formal shelter.

Listing and Interviewing 

In February 1990, Census Bureau interviewers who are part of the permanent
Census Bureau staff of interviewers visited each of the sample blocks to list
all housing units. To preserve independence, none of the temporary
enumerators hired to take the 1990 census was used for this listing operation
and the listing operation was not conducted out of the temporary census
offices. The reason for this was to make sure that temporary people taking
the census did not know where a PES sample block was, because if they did,
that block might be treated differently during the census.

After the completion of the regular 1990 census interviews, PES interviewers
interviewed persons at households in the PES sample blocks. Although this
interviewing drew from interviewers who had already worked on the 1990 census,
steps were taken to preserve independence, such as not allowing an interviewer
to work in a block in the PES that he or she had worked in during the census.

During the PES interview, the interviewers determined who was living in each
housing unit, obtained their characteristics, and asked where they lived on
April 1, 1990, Census Day. This latter question was necessary in order to
determine whether those people who had moved since Census Day had been counted
in the census. The PES interviewing began nearly 3 months after Census Day.

There was a quality assurance program for the interviewing phase to ensure
that the interviewers really visited the household and that the people listed
were indeed real. If interviewers made up people, they would not match to the
census and would inflate the undercount rate.

Matching

The next step was to match the persons enumerated during the PES (the
P-sample) to the census. Those persons in the P-sample matched to the census
were considered to have been counted in the census; those nonmatched were
considered to have been missed.

Matching was carried out in several stages. It involved an initial stage of
computer matching followed by clerical matching to attempt to resolve cases
that the computer could not match. Many of the persons not matched to the
census by computer and clerical matching were assigned for a follow-up
interview, if it was determined that additional information might help
establish whether a match to the census was appropriate. An additional stage
of clerical matching was then conducted using the information from the
follow-up interview.

The E-sample, those persons in the PES blocks who were enumerated in the
census, was examined to determine if they were correctly enumerated. E-sample
persons were matched back into the census to determine if they were enumerated
more than once (duplicates). The E-sample persons who were not matched to the
P-sample were potential candidates for erroneous enumerations. Some of these
unmatched census persons were also included in the PES follow-up operation
described above.

A final matching and reconciliation operation took place at the conclusion of
the PES follow-up. An important aspect of this operation was that situations
arose where correct match status for persons in the P-sample, or correct
enumeration status for persons in the E-sample, could not be determined. This
situation occurred because the initial interview was inconclusive or because
an incomplete interview was obtained during the follow-up.

Imputation and Dual System Estimates 

A final PES computer file was created that reflected the match status for
persons in the P-sample and the enumeration status (correct or erroneous) for
persons in the E-sample. Computer editing or imputation was performed to
correct, insofar as possible, for missing or contradictory data. A critical
aspect of imputation involved the estimation of a final match status for those
persons whose match status could not otherwise be resolved.

The data in the final PES file were then summarized and incorporated with data
from the full census to produce dual system estimates (DSE's) of total
population. Dual system refers to the fact that two systems (the census and
the PES) are used to make the population estimate. The DSE's were produced
separately for each of 1,392 unique subgroupings of the population called
post-strata. (See the following section titled Post-Strata.)

The DSE model to estimate total population conceptualized each person as
either in or out of the census, cross-classified as either in or out of the
PES. Essentially it involves determining how many people were (1) in the PES
and in the census (matches), (2) in the PES and out of the census (non-matches),
(3) in the census but not in the PES, and (4) in neither the census nor the PES.

To get an estimate of total population, you could add up the four cells listed 
above. But, only two of those were directly estimated (cell 1, matches, and 
cell 2, non-matches). Making some assumptions and using some basic algebra, 
total population can be estimated without direct estimates for each of the 
four cells. These operations and the DSE are explained more fully in the 
Hogan paper cited above. 
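
The algebra referred to above can be illustrated with the textbook dual system
(capture-recapture) estimator. The sketch below is only an illustration and
is not the Census Bureau's production formula. It assumes that, within a
post-stratum, being counted in the census and being counted in the PES are
independent events, and all function and variable names are hypothetical.

```python
def dual_system_estimate(census_correct, pes_total, matches):
    """Textbook dual system estimate of the total population.

    census_correct: persons correctly enumerated in the census
    pes_total:      persons found by the PES (the weighted P-sample)
    matches:        persons found in both the census and the PES
    Assumption: census capture and PES capture are independent.
    """
    if matches <= 0:
        raise ValueError("at least one match is needed")
    return census_correct * pes_total / matches


def cell_counts(census_correct, pes_total, matches):
    """Split the estimated total into the four in/out cells."""
    total = dual_system_estimate(census_correct, pes_total, matches)
    pes_only = pes_total - matches            # cell 2: non-matches
    census_only = census_correct - matches    # cell 3
    neither = total - matches - pes_only - census_only  # cell 4
    return {
        "both": matches,
        "pes_only": pes_only,
        "census_only": census_only,
        "neither": neither,
        "total": total,
    }


# Made-up numbers: 900 in the census, 800 in the PES, 720 in both.
print(cell_counts(900, 800, 720))
```

With these made-up inputs the estimated total is 1,000, of which 20 persons
fall in the cell missed by both systems. That fourth cell is never observed;
it is implied entirely by the independence assumption.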

Post-Strata 

The Census Bureau prepared the dual system estimates of the total population
for each of 1,392 groupings of people called post-strata. The reason for
forming the post-strata was to group persons who had similar chances
(probability) of being counted in the census. A person's likelihood of being
counted in the census (or in the PES) is called capture probability. The
post-strata were defined by census division, geographic subdivisions such as
central cities of large metropolitan statistical areas, whether the person was
the owner or renter of the housing unit, race, age, and sex. Each person in
the PES sample belonged in one of the unique post-strata.





For purposes of illustration, the following are examples of the 1,392
post-strata. One example is a post-stratum which contains Black males, age 20-29,
living in rented housing in central cities in the New York primary
metropolitan statistical area. A second example is that which contains
non-Black non-Hispanic females, age 45-64, living in owned or rented housing in a
non-metropolitan place of 10,000 or more population in the Mountain Division.
A third example is that which contains Asian males, age 45-64, living in owned
or rented housing in metropolitan statistical areas but not in a central city
in the Pacific Division. A fourth example is that which contains non-Black
Hispanic females, age 30-44, living in owned or rented housing in central
cities in the Los Angeles-Long Beach primary metropolitan statistical area or
other central cities in metropolitan statistical areas in the Pacific region.
As can be seen from these examples, the 1,392 post-strata are very specific.

Adjustment Factors 

The next step in the process was to compare the estimated total population for
each post-stratum (the dual system estimate or DSE) to the census count to
determine a "raw" adjustment factor. For example, if the DSE for a particular
post-stratum was 1,050,000 and the census count was 1,000,000, then the
adjustment factor was 1.05, reflecting about a 5 percent estimated net
undercount. Though most adjustment factors are larger than one, indicating an
estimated undercount, an adjustment factor may be less than one, which would
have the effect of lowering the census count for the post-stratum if an
adjustment is applied. This situation results when there is evidence of an
overcount in the post-stratum.
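
The arithmetic in the example above can be written out as a short sketch. The
function names are hypothetical, and expressing the net undercount rate
relative to the estimated population (rather than the census count) is an
assumption made here for illustration.

```python
def raw_adjustment_factor(dse, census_count):
    """Ratio of the dual system estimate to the census count."""
    return dse / census_count


def net_undercount_rate(dse, census_count):
    """Net undercount as a share of the estimated population (assumed base).

    A negative value indicates an estimated net overcount.
    """
    return (dse - census_count) / dse


factor = raw_adjustment_factor(1_050_000, 1_000_000)
rate = net_undercount_rate(1_050_000, 1_000_000)
print(f"factor={factor:.2f}, net undercount={100 * rate:.2f} percent")
```

On that assumed base, a factor of 1.05 corresponds to a net undercount of
roughly 4.8 percent, which the text rounds to "about a 5 percent."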

"Smoothing" the Adjustment Factors

The next step was "smoothing" these "raw" adjustment factors to reduce
sampling variance and to produce final adjustment factors. Because the PES
was a sample, it was subject to sampling error. Sampling error is the error
associated with taking some of the population (a sample) rather than all of
the population (a census). The process of smoothing the "raw" adjustment
factors to create final adjustment factors was a step to minimize the effect
of sampling error. Basically, smoothing is a regression prediction model. A
multivariate regression using items correlated with undercount predicts the
undercount for each of the 1,392 post-strata. Then, the final adjustment
factor is an average of the "raw" adjustment factor and the predicted
adjustment factor. For a post-stratum with low estimated sampling error,
there was heavy weight on the "raw" adjustment factor in the averaging, and
vice versa. The smoothing technique was based on certain assumptions and
would add an additional component of error called model error. The Census
Bureau hoped that the reduction in sampling error from smoothing would offset
any additional errors from the smoothing model chosen. If the Census Bureau
had not used smoothing, the final adjustment factors for some of the
post-strata would have been based on estimates of undercount that were subject to
very large sampling error.
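
The averaging step described above can be sketched as a precision-weighted
average. This is an illustration under stated assumptions, not the Census
Bureau's smoothing model: the regression prediction is taken as a given
input, the weighting rule (model variance divided by the sum of model and
sampling variance) is a common shrinkage form assumed here, and all names
are hypothetical.

```python
def smooth_factor(raw, predicted, sampling_var, model_var):
    """Average a raw factor with its regression prediction.

    raw:          raw adjustment factor for the post-stratum
    predicted:    factor predicted by the regression (assumed given)
    sampling_var: estimated sampling variance of the raw factor
    model_var:    assumed variance of the regression prediction error
    Low sampling variance puts heavy weight on the raw factor.
    """
    weight_on_raw = model_var / (model_var + sampling_var)
    return weight_on_raw * raw + (1.0 - weight_on_raw) * predicted


# Precisely measured post-stratum: the result stays close to the raw factor.
print(smooth_factor(1.08, 1.04, sampling_var=0.0001, model_var=0.0003))
# Noisy post-stratum: the result is pulled toward the prediction.
print(smooth_factor(1.08, 1.04, sampling_var=0.0009, model_var=0.0001))
```

The two calls show the behavior the text describes: the same raw factor is
kept largely intact when its sampling error is low and is pulled most of the
way to the predicted value when its sampling error is high.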





Small Area Estimation 

The Census Bureau used the final adjustment factors to produce adjusted counts
for every block in the Nation. The PES can only produce "direct" estimates of
the total population for relatively large geographic areas (i.e., the 1,392
post-strata). If there had been a decision to adjust, however, the adjustment
would have been applied to each of the Nation's approximately 5 million
populated blocks. The Census Bureau developed a model that took the
adjustment factors produced for each of the 1,392 post-strata areas and used
them to estimate adjustment counts for each block. Since each of the
post-strata contain many block parts, the Census Bureau based its model on a
critical assumption that coverage error is similar for all block parts within
a post-stratum. (A block part is simply that part of the block that falls
within the definition of a post-stratum. For example, females within a block
would be part of a block and in one set of post-strata while males within a
block would be in a different set of post-strata.) This assumption of all block
parts within a post-stratum being alike (homogeneous) with regard to the chance
of being counted is analogous to the homogeneity assumption for persons.

Finally, the Census Bureau produced a set of census tabulations with adjusted
counts. It did this by adding or subtracting "adjustment" persons with
detailed characteristics. The number of people added or subtracted was
determined by the final adjustment factor for the post-stratum that the block part
was in. If someone had to be added, the information from someone else in the
block part who was counted in the census was duplicated. If someone had to be
subtracted, the information for someone in the block part who was counted in
the census was deleted.
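
How a post-stratum factor carries down to a single block can be sketched as
follows. This is a simplified illustration with hypothetical names and
made-up numbers. It applies each post-stratum's final factor to the matching
block part and reports fractional results; the step that converts those
fractions into whole "adjustment" persons is not specified in this report
and is not modeled here.

```python
def adjust_block_parts(block_part_counts, final_factors):
    """Apply post-stratum factors to the block parts of one block.

    block_part_counts: {post_stratum_id: census count in this block}
    final_factors:     {post_stratum_id: final adjustment factor}
    Assumes every block part in a post-stratum shares the same factor.
    """
    return {
        post_stratum: count * final_factors[post_stratum]
        for post_stratum, count in block_part_counts.items()
    }


def net_change(block_part_counts, final_factors):
    """Persons added (positive) or subtracted (negative) for the block."""
    adjusted = adjust_block_parts(block_part_counts, final_factors)
    return sum(adjusted.values()) - sum(block_part_counts.values())


block = {"ps_a": 40, "ps_b": 10}         # two block parts in one block
factors = {"ps_a": 1.05, "ps_b": 0.98}   # made-up final factors
print(adjust_block_parts(block, factors))
print(net_change(block, factors))
```

The sketch makes the homogeneity assumption visible: the factor depends only
on the post-stratum, never on the particular block.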

Evaluations 

The PES and adjustment process are based on many assumptions and have the
potential for error. To evaluate the assumptions and potential error, the
Census Bureau conducted numerous studies called P-studies because they
referred to the PES. The studies were associated with the following general
areas.

Missing data on the PES questionnaire
Misreporting of Census Day address on the PES questionnaire
Fabrication of data in the PES by interviewers
Errors in matching
Errors in determining erroneous enumerations
Balancing omissions with erroneous enumerations
Correlation bias (the tendency of the DSE to underestimate total population
  because some people are missed in both the PES and the census)
The homogeneity assumption

The results of these evaluations are essential to determining whether adjusted 
or unadjusted census counts are more accurate. 





DEMOGRAPHIC ANALYSIS 

The Census Bureau's other coverage measurement program was demographic
analysis (DA). DA uses historical data on births, deaths, and legal
immigration; estimates of emigration and undocumented immigration; and
Medicare data to develop an independent estimate of the population. The DA
estimate of population is compared with the census count to yield another
measure of net census coverage. DA can only be used to make reliable
estimates at the national level. The DA coverage estimates were compared to
the post-enumeration survey coverage estimates to assess the overall
consistency of the two sets of estimates at the national level.

Birth and death records are available for the entire United States from 1933
on, but are not complete for years before 1933. Therefore, the Census Bureau
had to find other ways to estimate the number of people who were born or died
prior to 1933. In estimating births for each year, the Census Bureau added to
the number of registered births an estimate of under-registration.
Under-registration was estimated based on tests conducted in 1940, 1950, and
1964-1968. If the estimates of under-registration are off, they could have a
significant effect on undercount estimates because birth data are by far the
largest component in estimating the population through demographic analysis.
Since national birth and death records are not available before 1933, the
Census Bureau had to find other ways to estimate the size of the population 55
and older. For the population 65 and older, Medicare estimates are used. For
the population 55 to 64, estimates are made from revisions to earlier
estimates.

The United States does not keep emigration records. Therefore, an estimate
had to be made of persons who have left the country. While the United States
does have good records of legal immigration, there is no accurate estimate of
illegal immigration. The Immigration and Naturalization Service now collects
different information than it did prior to 1980. That change further
complicated the effort to estimate legal immigration. Recent legislative
reform allowing amnesty also complicated the issue, since the Census Bureau did
not know whether all of those obtaining amnesty actually reside in the United
States. The Bureau used professional judgment to estimate the components of
illegal immigration.

It is important to emphasize that results of demographic analysis are not
exact but are estimates. To a large extent, they were based on assumptions
and best professional judgment. As in the PES, the Bureau tried to estimate
potential error in the data produced by demographic analysis in a series of
studies called D-studies. Based on these studies, the Census Bureau developed a
range of error around the demographic analysis estimates.


UNDERCOUNT STEERING COMMITTEE 

To address the evaluation of the coverage in the census and the methods used
to evaluate that coverage (the PES and DA), the Census Bureau formed the
Undercount Steering Committee (USC). Their work was an important part of the
July 1991 decision whether to adjust the full 1990 census for all uses. The
work of the USC was also the major basis for the work done by CAPE. For a
detailed description of the findings of USC, see Technical Assessment of the
Accuracy of Unadjusted versus Adjusted 1990 Census Counts: Report of the
Undercount Steering Committee, June 21, 1991.





BACKGROUND ON INTERCENSAL ESTIMATES 

When the Secretary of Commerce announced his decision on July 15, 1991, not to 
adjust the 1990 census, he indicated his concern about the differential 
undercount. Because of that concern, he instructed the Census Bureau to 
continue its research into the area of potential adjustment. If the Census 
Bureau was able to resolve the technical problems associated with adjustment 
that were identified in the spring of 1991, then the Secretary asked the 
Census Bureau to consider incorporating results from the PES into the 
intercensal estimates program.

Basically, intercensal estimates are made by updating the most recent census
base with estimates of population change (births, deaths, and net migration).
Of course, the actual procedure is much more complicated and sophisticated.
The Census Bureau makes estimates at the national, state, and county level
every year and at the incorporated place (city) level every other year. These
estimates have a variety of uses. Most notably, the estimates are used in
funding allocations, as sample survey controls, and as denominators for many
important statistics.
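
The basic update described above can be sketched in a few lines. This is a
deliberately minimal illustration with hypothetical names and made-up
numbers; as the text notes, the actual estimation procedure is much more
complicated.

```python
def update_estimate(census_base, births, deaths, net_migration):
    """Carry a census base forward by the components of population change."""
    return census_base + births - deaths + net_migration


unadjusted_base = 1_000_000
adjusted_base = 1_016_000  # made-up base reflecting a hypothetical adjustment

for label, base in (("unadjusted", unadjusted_base), ("adjusted", adjusted_base)):
    estimate = update_estimate(base, births=15_000, deaths=9_000, net_migration=4_000)
    print(label, estimate)
```

Because the same components of change are added to either base, any
difference between an adjusted and an unadjusted base carries forward
unchanged into the later estimates, which is why the choice of base matters
for the whole decade.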

About one-third of the Federal funding programs use intercensal estimates of
population as part of their funding formula, rather than using the 1990 census
count for ten years. There may be items other than total population in the
formula as well. The General Accounting Office has estimated that about 10
billion federal dollars a year are allocated based on funding formulas that
use intercensal estimates.¹ States have within-state fund-allocation
programs as well. Many states use intercensal estimates to allocate
within-state funding dollars.

Many sample surveys use national, and to some extent state, intercensal
estimates as controls. The most notable is the monthly unemployment survey
(the Current Population Survey, or CPS). Sample surveys generally have poorer
coverage than a census; therefore, in order to improve the accuracy of
estimates from a sample survey, the sample survey estimates are often
controlled to an independent total (in this case, the intercensal estimate).
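
The control step described above can be sketched as a simple ratio
adjustment. This is only an illustration and not the weighting procedure
actually used for the CPS; the function name control_to_total and all of the
figures are hypothetical.

```python
def control_to_total(weights, control_total):
    """Scale survey weights so that they sum to an independent control total."""
    weighted_total = sum(weights)
    if weighted_total <= 0:
        raise ValueError("weights must sum to a positive number")
    factor = control_total / weighted_total
    return [w * factor for w in weights]


# Hypothetical survey whose weights represent only 950 of 1,000 people,
# reflecting the poorer coverage of a sample survey.
weights = [100.0, 250.0, 600.0]
controlled = control_to_total(weights, control_total=1000.0)
```

After the adjustment the weights sum to the control total, so every weighted
estimate from the survey is scaled up by the same coverage factor.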

Many Federal agencies produce statistics per 1,000 persons (or some other
base). Examples are crime statistics, incidence of certain health conditions,
etc. The numerator of these statistics can be obtained at various points in
time throughout the decade. In the absence of any updated information,
calculating these kinds of statistics on a static 1990 denominator would be
misleading; therefore, these Federal agencies use intercensal estimates of
population as the denominator.

In order to be responsive to the Secretary's request on intercensal estimates,
the Census Bureau formed the Committee to address the technical issues related
to a potential adjustment of the base for intercensal estimates. The
Committee was made up of many people who also served on the Undercount
Steering Committee for the July 1991 decision. However, the Committee also


1 Federal Formula Programs - Outdated Population Data Used to Allocate
Most Funds (GAO/HRD-90-145, September 1991).





included some new members, including some Census Bureau staff very familiar
with intercensal estimates. Though the Committee focused on the technical
issues surrounding a potential adjustment, early in the Committee's
deliberations, the Committee also had to make some key decisions related to
the unique nature of the intercensal estimates program. The Committee decided
that:


1. For the purpose of survey controls, there would be only one decision
point in the decade about whether to adjust intercensal estimates.

2. If there was a decision to adjust, there would have to be a mechanism
to make the intercensal estimates additive from the smallest area to the
national total.

3. There would not be adjustment for some uses of intercensal estimates,
but no adjustment for other uses of the estimates.

4. If there were a decision to adjust, the amount of the adjustment would
be calculated on the base population. This adjustment plus an estimate of
population change for the time period since the census would be added to
the unadjusted base.

After every census, there is a change in the base used to calculate the
intercensal estimates. Apart from the question of adjustment, there would be
a change from a 1980 census base to a 1990 census base. For the use of
estimates as survey control totals, that changeover date was postponed from
January 1992 to January 1993. Therefore, 1992 estimates released in January
1993 would reflect the 1990 base. The postponement was made so that the
decision on whether to adjust the base for intercensal estimates could be made
at the same time. If there is a decision to adjust, then the change to a 1990
base and the change to a 1990 adjusted base would be simultaneous. If the
decision is not to adjust, then there will be a change to the 1990 unadjusted
base. In that case, even if evidence later in the decade would lead one to
support adjustment, the base would not be changed from 1990 unadjusted to 1990
adjusted at a later point in the decade for the purpose of survey controls.

Any change in base presents a discontinuity in uses based on intercensal
estimates. Federal agency users of intercensal estimates for survey controls
were quite clear that they strongly preferred only one such discontinuity
during the decade.

On a technical basis, it is conceivable to be able to support adjustment at
one level (say states), but not at lower levels. In such a case, state
estimates would add to the national estimate, but substate estimates would not
add to state estimates. There was agreement from users and from the staff
making the estimates that failure to have additivity was not only undesirable,
but close to unacceptable. Also, on a technical basis, it is conceivable to
be able to support adjustment for one purpose (for example, national survey
controls), but not for another (for example, subnational fund allocation).
The Committee found this situation undesirable. Finally, it is possible for
the Census Bureau to decide not to adjust the base of estimates but for some
Federal agencies to do their own adjustment. This topic was discussed among
Federal agencies at a meeting at the OMB. There was general agreement that it




would be unacceptable to have variable sets of intercensal estimates used 
differently by different Federal agencies. 

Estimates start with a base population and add estimated population change 
(births, deaths, and net migration). If estimates are adjusted, an additional 
term would be added that represents the net adjustment level for each area. 
This net adjustment level is the difference between the adjusted base 
population and the unadjusted base population. In the estimation process, the 
sum of this net adjustment and the estimated population change would be added 
to the unadjusted population base. Under this procedure, the net adjustment 
would remain constant throughout the decade. 
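
The estimation procedure in the paragraph above can be written out as a short
calculation. This is a minimal sketch under the assumptions stated in the
text; the function name intercensal_estimate and the population figures are
hypothetical, and the actual procedure is considerably more elaborate.

```python
def intercensal_estimate(unadjusted_base, adjusted_base,
                         births, deaths, net_migration):
    """Unadjusted base + constant net adjustment + population change.

    births, deaths, and net_migration are cumulative since the census date.
    """
    # The net adjustment is fixed once, from the two base populations.
    net_adjustment = adjusted_base - unadjusted_base
    population_change = births - deaths + net_migration
    return unadjusted_base + net_adjustment + population_change


# Hypothetical area: census count 100,000; adjusted count 101,600.
year_1 = intercensal_estimate(100_000, 101_600, 1_500, 900, 200)
year_2 = intercensal_estimate(100_000, 101_600, 3_100, 1_850, 350)
```

In both years the same net adjustment of 1,600 is carried forward; only the
estimated population change differs.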





FURTHER RESEARCH 
THE BASIS FOR THE ASSESSMENT 




When discussing the issue of whether to adjust the 1990 census, almost all
experts agreed that with more time, there would be refinements and changes to
the estimated undercount. Most experts, however, assumed these changes would
be relatively small. Since the July 1991 decision, the Census Bureau has had
the time and, at the direction of the Secretary of Commerce, has continued to
examine the estimated undercount. As expected, the Census Bureau has made some
refinements and changes. During this analysis, the Census Bureau discovered a
significant computer processing error in the system used to determine the
undercount estimates that were under consideration in spring 1991. As a
result of an error in computer processing, the estimated national undercount
rate of 2.1% was overstated by 0.4%. After correcting the computer error, the
national level undercount was estimated to be about 1.7%. After making other
refinements and corrections, the national undercount is now estimated to be
about 1.6%. Attachment 3 shows revised undercount estimates by selected
age-sex-race categories. Attachment 4 shows revised undercount estimates by
state. Attachment 11 shows revised undercount estimates for cities of 100,000
or more population. Attachment 12 shows revised undercount estimates for
counties of 100,000 or more population.

Since PES undercount estimates were based on a sample survey, they are subject
to error. There is sampling error to reflect the fact that the information
came from some and not all of the population. The estimates are also subject
to biases. For example, errors in matching, erroneous responses from
respondents, etc. can bias the undercount estimate. Just as for the estimate
of undercount, the Census Bureau also refined its estimates of bias. The
level of total bias, excluding correlation bias 2, on the revised estimate of
undercount is negative 0.73 (-0.73%). Therefore, about 45% (0.73/1.58) of the
revised estimated undercount is actually measured bias and not measured
undercount. In 7 of the 10 evaluation strata 3, 50% or more of the estimated
undercount is bias. When correlation bias is included, these percentages go
down. With correlation bias, the revised estimate of total bias is negative
0.35 percent (-0.35%). Including correlation bias, about 22% of the revised
estimate of undercount is actually bias and not measured undercount. In
general, the Committee was concerned that the estimate of correlation bias
could be an underestimate, which meant the total bias estimate of negative
0.35% was an overstatement. There was limited time and methodology to
investigate this concern further. The Committee did not feel lack of more
information on this concern had an appreciable effect on their overall
conclusion.


2 Correlation bias is a term that reflects the fact that the DSE of total
population based on the PES is an underestimate for the model used by the
Census Bureau. The DSE is downwardly biased because of correlation bias which
occurs, for example, because there are people missed in both the census and
the PES. Correlation bias is described more fully below in the section
entitled Third Issue-Part B, p 21.

3 See Attachment 6 for a description of evaluation post-strata. 





When the Committee began discussing the issue of whether to adjust the base
for intercensal estimates, it started by reviewing the technical concerns
raised about whether to adjust the 1990 census. This analysis produced a
list of concerns, which the Committee summarized into five key areas.

1. Could the problems in the smoothing model, including lack of
robustness, be resolved?

2. Could the estimated biases in the PES estimate of undercount be
removed?

3. Were all components of the bias adequately reflected in the total error
model, and was total error being accurately handled in loss function
analysis?

4. Could we learn more about whether or not our homogeneity assumption
held sufficiently to support adjustment?

5. Could we resolve the inconsistencies between the PES and other
estimates of undercount, primarily Demographic Analysis?

There were other issues raised. While it would have been helpful to research
these other questions as well, the Committee felt comfortable in confining its
research efforts to the five key questions. The Committee felt they could
make a reasoned choice about whether to adjust the base for intercensal
estimates if they got appropriate information on these five issues.


FIRST ISSUE: COULD PROBLEMS IN THE SMOOTHING MODEL BE RESOLVED? 

Summary: The Committee was very comfortable with the new
post-stratification scheme which reduced sampling variance enough to avoid
the use of smoothing. However, because of the limitations of
artificial population analysis 4, there was still some concern with
the finding that there was no loss in homogeneity 5 in a smaller
post-stratum design that had only about 25% as many post-strata. (See
fourth issue.)

For the July 1991 decision on whether to adjust the 1990 census, the sample
of about 400,000 people was post-stratified into 1,392 groups. A person
could be in one and only one of the 1,392 post-stratum groupings. Some of


4 Artificial Population Analysis refers to the study to examine if the
persons within each of the 357 post-strata were alike (homogeneous) with
regard to their probability of being counted in the census. Artificial
Population Analysis is described below in the section entitled Fourth Issue,
p 25.


5 To make estimates from the PES, each sample person is assigned to one
and only one post-stratum. A necessary assumption is that every person within
a post-stratum has approximately the same chance of being counted in the
census or the PES. This assumption is called the homogeneity assumption.




those post-stratum groupings were quite small so the estimate of undercount 
was subject to very high sampling variance. In order to reduce this 
sampling error, the Census Bureau used a technique called smoothing. 
Smoothing was a regression prediction model. Based on items correlated 
with undercount, the undercount for each of the 1,392 post-strata was 
predicted using the regression model. Then, the final undercount was an 
average of the predicted undercount and the directly observed undercount. 
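
The smoothing idea described above (predict each post-stratum's undercount
from a regression, then average the prediction with the direct estimate) can
be sketched as follows. This is an illustration only: the single covariate,
the equal weighting, and the names fit_line and smooth are assumptions, since
the text does not give the Census Bureau's regression variables or weights.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def smooth(direct, covariate, weight_on_direct=0.5):
    """Average each direct estimate with its regression prediction."""
    intercept, slope = fit_line(covariate, direct)
    predicted = [intercept + slope * c for c in covariate]
    return [weight_on_direct * d + (1.0 - weight_on_direct) * p
            for d, p in zip(direct, predicted)]


# Hypothetical post-strata: covariate is the share of renters,
# direct is the observed undercount rate in percent.
renter_share = [0.1, 0.2, 0.3, 0.4]
direct = [1.0, 3.0, 2.0, 4.0]
smoothed = smooth(direct, renter_share)
```

Each smoothed value lies between the noisy direct estimate and the fitted
line, which is how the technique reduces sampling variance for small
post-strata.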

The smoothing process was successful at reducing the sampling variance.
However, there were several issues raised about the entire smoothing
process. It would have taken a large, intense, and uncertain research
program to have answered all of these concerns. Therefore, the Committee
chose a different approach. The Committee agreed to reduce the number of
post-strata. By doing so, each new post-stratum would have more sample
size than under the 1,392 system, and presumably, enough sample size so
that the estimates would be stable (meaning the estimates would not have
very large sampling variance); therefore, no smoothing would be required.

It was expected that there would be some loss of homogeneity by going to a
smaller post-stratum design, since with fewer strata, each stratum now had
more people. Therefore, one could expect that it was less likely that
everyone within these larger strata had the same capture probability as in
smaller strata. The Committee assumed that the loss in homogeneity would
be smaller than the problems and potential error from smoothing. As it
turned out, the Committee's assumption seemed to be correct.

Based on measures of census performance and general patterns of undercount,
a new set of 357 strata were designed. The 357 strata were not a simple
regrouping of the 1,392 strata. The 357 strata design included 51 main
strata defined by geography, owner-renter, and race/Hispanic cross
classified by 7 age groupings cross classified by male-female. Attachment
5 contains a description of the 357 post-stratum design. This 357 design
turned out to be a very effective stratification, primarily because we were
able to examine additional data before defining the strata. Perhaps the
most important piece of information for this examination was the strong
relationship of living in owner or renter housing units to undercount.
Hence, owner-renter status is very prominent in the 357 design.

We prepared revised PES estimates of undercount based on the 357 design and
analyzed sampling variance by post-stratum. The intent was to verify the
assumption that the sampling variances under the smaller (357) design would
be relatively stable. At the state level, the variances were at an
acceptable level 6. Attachment 10 contains revised estimates of undercount
or overcount for the 51 main post-strata that were part of the 357
post-stratum design.

The Committee was also concerned with the potential loss of homogeneity
with the smaller post-stratum design. Using artificial population
analysis, the Committee examined the homogeneity of the 1,392 design
compared to the 357 design. Artificial population analysis is described
below in the section called Fourth Issue. Based on the artificial


6 C.A.P.E. minutes 4-6-92, Attachment 3.





population analysis assuming no bias in the PES, the Committee found the
homogeneity for the 1,392 design and the 357 design to be about the same 7.
This result at first seemed counter-intuitive since one would have expected
some reduction in homogeneity. However, the result may be explained by the
fact that the 357 design is much more effective than the 1,392 design
(probably true since the 357 design was based on a careful review of
auxiliary data), by limitations of the artificial population analysis, or
by a combination of both those factors.

In summary, the Committee was very comfortable with the new stratification.
In general, for state-level estimates, the Committee felt satisfied with
the 357 design without smoothing versus the 1,392 design including
smoothing. However, because of the limitations of artificial population
analysis, there was still some concern with the finding of no loss in
homogeneity by going to a smaller post-stratum design that had only about
25% as many post-strata.


SECOND ISSUE: CAN ESTIMATED BIASES BE REMOVED FROM PES ESTIMATES? 

Summary: One of the first steps in further analysis of the PES was to
re-examine the 104 blocks which had the greatest effect on the
undercount. Many of the blocks had such a significant effect, they
could be considered outliers. As a result of the examination of 104
blocks 8, corrections to the Post Enumeration Survey (PES) undercount
estimates and bias removal were conducted. The net result was to
reduce the estimated national net undercount by 0.1%. During that
analysis, the Census Bureau also found and corrected a computer error
that had incorrectly overstated the 2.1% undercount reported in July
1991 by 0.4%. The July 1991 estimate of undercount was reduced by 0.4%
because of the computer error and an additional 0.1% because of
modifications and bias removal, resulting in a revised July 1992
national PES estimate of undercount of about 1.6%. The Committee
obviously was satisfied that the decision to do a review of 104 blocks
led to the discovery of the computer processing error. The Committee
was also confident that outlier blocks had been more appropriately
handled. As for bias removal, the Committee had mixed feelings. They
were pleased that the review of only 104 blocks had removed a
relatively large amount of bias. But, a significant amount still
remained. The Committee could find no reliable or expedient method to
remove the balance of the bias from the PES estimates.

The PES estimates of undercount are subject to biases. The Census Bureau
had many evaluation programs to try to measure the level of these biases.

At the U.S. level for total population, the estimated bias was negative
0.73% (or negative 0.35% if correlation bias is included) on an estimated


7 C.A.P.E. minutes 4-6-92 Attachment 5 and C.A.P.E. minutes 3-9-92 
Attachment 1. 

8 Small blocks were often combined to form block clusters. This report
uses blocks to refer to blocks and block clusters.





undercount of about 1.6%. If it were possible, it would be desirable to
remove these biases before any potential adjustment since the PES estimate
of undercount including the bias is an overstatement of the undercount the
PES actually measured. At the U.S. level for total population, the bias
could be removed. The Committee discussed the possibility of removing the
bias at sub-national levels. The only alternative was a modeling approach.
Considering the very small samples used to estimate the biases and the
difficulties of modeling, the Committee was very reluctant to try to remove
the bias by modeling. The Committee was concerned that more error would be
introduced than the level of error we were trying to remove. A further
complication was the concern that our estimate of correlation bias was
conservative (see page 15).

As a partial solution to bias removal, the Committee recommended an
examination of the blocks that had the potential to contribute the most to
the PES estimate of undercount. If the bias could be removed from these
blocks, the PES estimates would be improved. Of course, the results from
this set of blocks could not be generalized to other blocks, so any
solution would only be a partial removal of the bias. 104 blocks were
included in the study. The study is referred to by various names since
additional components to the study were added over time 9. This study was
originally called OCR (Outlier Cluster Review) because of the intent to
review the blocks that had outliers. When the study was expanded to a
second purpose (removal of bias), the study was called Selective Cluster
Review (SCR).

During the SCR, several types of problems were examined. The treatment of 
outliers was reexamined and corrected as necessary. Some blocks had 
unusual results and had very big effects on the estimated undercount, 
effects far larger than one block should be expected to have. These are 
called outliers. They are similar to unusual marks by judges in athletic
competitions. For the July 1991 estimates of undercount, there was a 
method to defuse the effect of these outliers. Now, with more time, we 
were able to reexamine these outliers and to use better methods (when 
applicable) to dampen their effect. 

In addition, during SCR, we looked for errors. An example is failure to
search in the proper block. Searching for matching should have been done
in the PES sample block as well as the ring of blocks surrounding the
sample block. Generally, this was done. Sometimes errors were made and
the matchers failed to look into the entire ring. Mistakes like these were
corrected.

Matching, even in the proper set of blocks, is error prone. Errors in
matching can lead to a bias in the PES estimates. During SCR, expert
matchers tried to remove all matching error and therefore any bias in the
PES estimate due to matching.





As a result of all aspects of SCR, the estimated national undercount was 
reduced by one-tenth of one percent (0.1%). The bias reduction only 
applied to the 104 blocks and could not be generalized to other blocks. 

The 104 blocks represent about 2% of the total sample while the 0.1%
reduction on an estimated 0.7% total bias represents about a 14% reduction.
Even though total bias could not be removed, these numbers show that the
effort of redoing these 104 blocks was well worth it. The results of the
SCR were also subtracted as appropriate from the total bias so that the
resulting total bias only represents residual error for residual blocks
(the total minus these 104 blocks).

During the SCR, Census Bureau staff discovered a computer processing error
that affected the estimates of undercount released in July 1991. Codes
that were attached to cases in clerical processing were incorrectly fed
into the computer processing. Errors went in both directions (increasing
and decreasing the estimated undercount), but the net result of correcting
the error was to reduce the estimated national undercount of 2.1% by 0.4%.


THIRD ISSUE: IS THE TOTAL ERROR MODEL COMPLETE? 

Summary: With regard to total error, the Committee was completely
satisfied that all components of bias were represented. The Committee
was concerned about the accuracy of some of the estimates of bias and
the high variance for some estimates of bias. The general conclusion
was to use caution in evaluating the results of loss function analysis
since the target numbers in that analysis were so dependent on the
levels of estimated bias. The Committee felt that correlation bias
should be a component of total error. However, there was concern
about our method of estimating it and very serious concern about the
method of allocating it to states, cities, etc. Since there did not
appear to be methods or time to analyze this allocation issue further,
the Committee requested that loss function analysis be done with and
without correlation bias. There was a choice of various loss
functions. Primarily, the Committee concentrated on loss functions
that examined proportionate population shares and not population
counts. In addition, in general, the Committee considered loss
functions based on squared error, not absolute error. Using hypothesis
tests with 10% significance, loss function analysis excluding
correlation bias does not support adjustment. Using hypothesis tests
with 10% significance and including correlation bias, all but one of
the loss function analyses favors adjustment at the state level when
examining aggregate loss. The Committee tended to accept these
findings keeping in mind the numerous caveats. As a result of some
comments from the Panel of Experts, the Committee was concerned about
whether the significance level they used for the hypothesis tests was
appropriate.


9 "Post Census Rematching for the Outlier Cluster Review," Howard Hogan,
undated; C.A.P.E. minutes 6-11-92 Attachment 1,2; C.A.P.E. minutes 4-20-92
Attachment 2.





THIRD ISSUE-PART A: TOTAL ERROR 

The third major concern was whether the total error model contained all
components of error and whether the components of error were adequately
measured. In terms of whether all components of error were considered, two
new components were added: error due to cases done very late in the
regular census (called late-late returns) and treatment of out-of-scope
cases. The Committee felt completely confident that all components of
error had been listed and considered.

The Committee could come to no agreement about the adequacy of the level of
error measured for each of these components. There were concerns that
matching error was determined by a dependent study and not an independent
study. There were concerns that evaluation interviews used to determine
the quality of the PES were conducted in February 1991, ten months after
the census. There was concern that the estimate of only 13 fabrications in
a sample of 150,000 seemed low compared to reasonable expectations. The
Committee strongly agreed that the evaluation sample sizes were too small.
The sampling error on several of the estimates of bias was extremely high.

In summary, with regard to total error, the Committee was satisfied that 
all components of error were represented. The Committee was concerned about 
the accuracy and variance of the estimates of bias, but there was really 
nothing that could be done. The general conclusion was to use caution in
evaluating the results of loss function analysis since the target numbers
in that analysis were so dependent on the levels of estimated bias.
Attachment 6 contains estimates of the bias. 


THIRD ISSUE-PART B: CORRELATION BIAS 

The Committee spent a good deal of time discussing one aspect of total
bias: correlation bias 10. The Dual System Estimate (DSE) of total
population produced by comparing the PES and the census is a biased
estimate. It is biased because of matching error, etc. These components
of bias are described immediately above.

The DSE can also be biased by correlation bias, which has multiple
components. The first is that the DSE assumes that a person's
participation in the PES is not affected by his or her participation in the
census (the "causal independence" assumption). Failure of this assumption
can cause a bias. Generally, lack of independence is not considered to be a
big problem since the PES is conducted almost 4 months after the census and
because of other controls introduced into the PES system.

The second component of correlation bias occurs because of variable capture 
probabilities within a post-stratum. The DSE does not require that the 
census and the PES have the same probability of counting people (called 
capture probability). But, the DSE does assume that within a post-stratum,


10 Sometimes, model bias is used synonymously with correlation bias. In
this report, correlation bias will be used.




everyone in the PES (or everyone in the census) has approximately the same
capture probability. So, for example, a white male renter age 30-49 in 
rural areas of Louisiana is assumed to be just as likely to be counted as a 
white male renter age 30-49 in rural Mississippi, etc. Generally, if 
people within a post-stratum have differing capture probabilities, then the 
DSE is downwardly biased. That means the DSE underestimates the total 
population and in most cases would underestimate the undercount. 
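
The downward bias from unequal capture probabilities can be illustrated with
a small calculation. This is a sketch under simplifying assumptions: it works
with expected counts rather than sample data, it assumes the census and the
PES capture people independently within each subgroup, and the function name
expected_dse and all figures are hypothetical.

```python
def expected_dse(groups):
    """Expected dual system estimate for one post-stratum.

    groups is a list of (true_size, census_capture_prob, pes_capture_prob),
    one tuple per subgroup inside the post-stratum.
    """
    census = sum(n * pc for n, pc, pp in groups)        # counted in census
    pes = sum(n * pp for n, pc, pp in groups)           # counted in PES
    matched = sum(n * pc * pp for n, pc, pp in groups)  # counted in both
    return census * pes / matched


# One post-stratum of 1,000 people with a common capture probability.
homogeneous = expected_dse([(1000, 0.9, 0.8)])

# Same 1,000 people, but half are harder to count in both systems.
heterogeneous = expected_dse([(500, 0.95, 0.9), (500, 0.7, 0.6)])
```

With a common capture probability the estimate recovers the true 1,000. When
half of the post-stratum is harder to count in both the census and the PES,
the estimate falls to roughly 971, an underestimate of the true population.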

As a special case of variable capture probabilities, assume within a
post-stratum there is a set of people with zero probability of being captured.
These are often called the "impossible to count" or people missed in both the
census and the PES. They are another component of correlation bias.

There are no direct estimates of either of these components of correlation
bias, but an estimate for the total of both combined is obtained by
comparing PES estimates to Demographic Analysis (DA) estimates. To
estimate the level of correlation bias, the assumption is that sex ratios
as determined by DA are accurate. Then, since in general the DSE estimates
of males are lower than the DA estimates of males, there is a calculation
of how many males would have to be added to the DSE to make the PES sex
ratio equal to the DA sex ratio. These added males are an estimate of the
level of correlation bias in the PES.
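
The sex-ratio calculation described above can be sketched in a few lines.
This is an illustration, not the Census Bureau's computation: it assumes the
sex ratio is expressed as males per female, and the function name
males_to_add and the figures are hypothetical.

```python
def males_to_add(dse_males, dse_females, da_sex_ratio):
    """Males needed so the DSE sex ratio equals the DA sex ratio.

    da_sex_ratio is males per female from Demographic Analysis
    (an assumed convention for this sketch).
    """
    target_males = da_sex_ratio * dse_females
    return target_males - dse_males


# Hypothetical group: the DSE finds 480,000 males and 520,000 females,
# while DA implies 0.95 males per female.
added = males_to_add(480_000, 520_000, 0.95)
```

In this made-up case about 14,000 males would have to be added for the two
sex ratios to agree, and that figure would serve as the estimate of
correlation bias for the group.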

Actually, after estimating the extent of correlation bias, it is not added
to the DSE of total population (just as other estimates of bias are not
subtracted). Rather, the estimate of correlation bias is added to the total
error model and is used to determine target numbers for loss function
analysis.

The Committee was concerned about the combination of the two components of
correlation bias, but there did not appear to be any alternative. The
Panel of Experts expressed the same sentiment. They agreed that they were
uncomfortable with the combination, but there does not seem to be an easy
alternative. The Committee also was concerned that the PES measures more
females than DA so that this method of estimating correlation bias should
have had the effect of estimating a true population (for loss function
analysis target numbers) that was bigger than total population in DA.
However, the sum of the target populations did not equal the sum of the PES
estimate and the level of correlation bias that was estimated to be added,
as it should have. There was no time to examine these concerns further.
Finally, there was concern that the method used for comparing the DSE with
bias to DA understated the estimate of people missed due to correlation
bias.

Mostly, however, the Committee was concerned with the method of allocating
the correlation bias. Basically, the estimated missing people due to all
types of correlation bias (all males) are allocated back to each
post-stratum proportional to the estimate of the number of males in the fourth
cell of the DSE for the post-stratum. Further modeling is used to allocate
the total error down to sub post-stratum levels.

The fourth cell in the DSE is an estimate of the number of people missed in
both the PES and the census, but it is a biased estimate because of




correlation bias. It is not directly estimated, but an estimate can be
obtained by subtraction. Some of the numbers used in the subtraction are 
sample estimates, therefore, they are subject to sampling variability. The 
fourth cell is expected to be the product of the true population times one 
minus the capture probability of the PES times one minus the capture 
probability for the census. In theory, this number cannot be negative. 

But, in practice, due to sample variability, matching error, etc., it can 
be estimated to be negative. When the estimate in the fourth cell is 
negative, no amount of the estimated people missed due to correlation bias 
is allocated to that post-stratum. 
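
The fourth-cell calculation and the allocation rule described above can be
sketched as follows. This is a simplified illustration, not the Census
Bureau's procedure: the function names are hypothetical, the inputs are
treated as separately estimated quantities, and dropping negative cells
before computing proportions is an assumption about how the shares are
normalized.

```python
def fourth_cell_theoretical(true_population, p_census, p_pes):
    """Expected number missed by both systems; never negative in theory."""
    return true_population * (1.0 - p_pes) * (1.0 - p_census)


def fourth_cell_by_subtraction(dse_total, in_both, census_only, pes_only):
    """Fourth cell obtained by subtraction; sample estimates can make it
    negative."""
    return dse_total - (in_both + census_only + pes_only)


def allocate_correlation_bias(total_missed, fourth_cells):
    """Allocate missed people in proportion to positive fourth cells.

    Post-strata with a negative fourth-cell estimate receive nothing.
    """
    eligible = [max(cell, 0.0) for cell in fourth_cells]
    total = sum(eligible)
    if total == 0:
        raise ValueError("no post-stratum has a positive fourth cell")
    return [total_missed * cell / total for cell in eligible]


# Hypothetical fourth-cell estimates for three post-strata.
cells = [
    fourth_cell_by_subtraction(10_300.0, 9_000.0, 600.0, 400.0),
    fourth_cell_by_subtraction(5_000.0, 4_500.0, 350.0, 200.0),
    fourth_cell_by_subtraction(20_700.0, 18_000.0, 1_200.0, 800.0),
]
allocation = allocate_correlation_bias(1_000.0, cells)
```

Here the second post-stratum has a negative fourth cell and receives no
allocation, so all of the estimated missed people go to the other two
post-strata in proportion to their fourth cells.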

Both the Committee and the Panel of Experts were very concerned about the 
negative values in the fourth cell. The Panel of Experts suggested some 
methods to change the DSE process to avoid negative values. There was also 
considerable concern about using the fourth cell as the basis for 
allocation of the estimate of people missed due to correlation bias. In 
fact, other methods of allocation had been tried by the Census Bureau.

In summary, the Committee felt that correlation bias should be a component 
of total error. However, there was concern about our method of estimating 
it and very serious concern about the method of allocating it. Therefore,
the Committee requested that loss function analysis be done with and 
without correlation bias. Each Committee member would then have to make 
some judgements about how to analyze the results. 


THIRD ISSUE-PART C: LOSS FUNCTION ANALYSIS 

Estimates of bias in the PES estimates of undercount are useful for
interpreting the accuracy of the PES estimates. But estimates of bias
were also a key component in a summary analysis called loss function
analysis. If truth were known, the census count and the adjusted base
count could be compared to truth and an appropriate choice could be made.
That, of course, is impossible. To approximate that comparison, the Census
Bureau performed loss function analysis.

As a first step in loss function analysis, the true population is
estimated. This estimate is called the target population. It is estimated
by taking the PES estimate of population and modifying that estimate based
on the estimates of error in the PES (the components of bias from the total
error model). These estimates of bias are also subject to error, so you
can't simply subtract bias from the PES estimate and assume that is the
true population. A further complication is that estimates of bias are only
available for 10 evaluation post-strata and target numbers are needed for
every state, every county, every place, etc. A modeling system is used to
allocate the bias from the 10 evaluation post-strata to sub-levels of
geography. Once target numbers are calculated, there is a comparison to
see whether census counts or adjusted counts are closer to the target
numbers, which are assumed to be "truth." There is still an issue of what
is the appropriate comparison between census, adjusted, and target numbers.
Should it be a simple difference? If so, how are pluses and minuses
handled? Should it be the square of the differences, which avoids the
problem of pluses and minuses but overemphasizes states (or other areas of
interest) with big differences? Or should it be some kind of weighted
squared difference to avoid the over-effect of big states but to still
reflect some of the differences in state size?
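
The three candidate comparisons raised above can be sketched as follows. This
is a minimal illustration, not the Committee's actual loss functions: the
function names and the numbers are hypothetical, and dividing each squared
error by the target size is only one assumed form of "weighted squared
difference."

```python
def absolute_loss(counts, targets):
    """Sum of absolute differences, so pluses and minuses cannot cancel."""
    return sum(abs(c - t) for c, t in zip(counts, targets))


def squared_loss(counts, targets):
    """Sum of squared differences; large areas dominate the total."""
    return sum((c - t) ** 2 for c, t in zip(counts, targets))


def weighted_squared_loss(counts, targets):
    """Squared differences divided by target size (an assumed weighting)."""
    return sum((c - t) ** 2 / t for c, t in zip(counts, targets))


# Hypothetical target, census, and adjusted counts for two areas.
targets = [1000.0, 100.0]
census = [980.0, 95.0]
adjusted = [1005.0, 103.0]

for loss in (absolute_loss, squared_loss, weighted_squared_loss):
    print(loss.__name__, loss(census, targets), loss(adjusted, targets))
```

Whichever set of counts yields the smaller aggregate loss is the one closer
to the target numbers under that particular loss function.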

The Committee could come to no consensus on these difficult questions.
Therefore, the Committee ran a variety of loss functions. These were a
combination of:

-Various methods of allocating the bias to target numbers

-With and without correlation bias

-Absolute and squared error as well as variations of those to take
account of variation in state (or other area of interest) size.

Even with these various loss functions, there was still another important
question. Do you only look at the aggregate loss over all areas of
interest (for example, all states), or do you look at individual losses? This
question was discussed with the Panel of Experts. The Panel felt that a
simple count of "winners" and "losers" was inappropriate. One suggestion
was to use a Pitman nearness measure. Time prevented that kind of
analysis. In the absence of this measure, the Committee continued its
original intent to examine aggregate loss. The Panel supported analysis of
aggregate loss. In doing aggregate loss analysis, the Committee heeded the
advice of the Panel of Experts, who strongly recommended that loss function
analysis be viewed only as a tool and not an exact decision mechanism.

In examining total loss over a set of areas (like all states), there was a
question about whether the difference in aggregate loss between the census
and adjusted base counts was a real difference or only due to random error.
The Census Bureau had developed a statistical hypothesis test to try to
answer that question. The Panel of Experts reviewed this work as well. In
particular, the representative from Statistics Canada, which faces the same
problem, commented on the proposed hypothesis test. That expert warned
that in effect we were not doing a standard hypothesis test, but rather we
would be making a decision on which set of estimates to use based on the
results of the test. If we continued with the standard test, we could be
making mistakes about what level of significance to use. The most
appropriate level might very well be larger than the 10% level of
significance the Committee chose to use. Because of the lateness of the
suggestion, time prevented us from completely examining the alternative
hypothesis test approach. Hence, the Committee used, with caution, the
significance level of standard hypothesis test results.

In summary, using hypothesis tests with 10% significance, loss function 
analysis excluding correlation bias does not support adjustment. Using 
hypothesis tests with 10% significance and including correlation bias, all
but one of the loss function analyses favor adjustment at the state level




Various types of loss function analyses were used to compare the estimated 
scaled surrogate variables with the actual scaled surrogate variables. If 
the loss from the estimate was small, you could assume that the
post-stratification was good and the homogeneity assumption was holding. If the
loss was large, there would be cause for concern. In addition, we could
examine the number of places (states, cities, etc.) "improved" by
adjustment. We could do this kind of analysis for surrogate variables 
since we know truth (the actual value of the surrogate variable). 

Based on artificial population analysis, a first analysis showed similar
homogeneity for the 1,392 design, the 357 design, and a
design with only 2 strata. Further analysis showed two problems. One, the
surrogate variables did not vary much by post-stratum. Since the
assumption was that undercount did vary by post-stratum, there was concern
about whether this set of surrogate variables was a good set. Another
concern was that the analysis assumed no bias in the surrogate variable
estimates, and the PES estimates of undercount are biased. Therefore, there
was an attempt to find additional surrogate variables as well as to
introduce bias into the artificial population analysis. Artificial
population analysis was rerun with various levels of constant bias added.
The bias in the PES is not constant, but there was no adequate way to
introduce variable bias into the artificial population analysis.

The original five surrogate variables were:

-Allocation Rate (The rate at which questions without answers on the census
questionnaire had to be allocated a response)

-Percent of population covered by the mail census procedure
-Percent enumerated by mail (mail return rate)

-Substitution rate (The rate at which an entire person's census
characteristics had to be created by a computer algorithm)

-Percent of housing units that were multi-unit

The three additional items were:

-Percent in poverty
-Percent unemployed
-A mobility statistic

For states and most large geographic areas, without any bias, artificial
population analysis supported the homogeneity assumption, assuming that the
surrogate variables act like undercount. Once bias is introduced, however,
the artificial population analysis shows less and less homogeneity. When
bias is 25% of the estimate, the artificial population analysis indicates
that there is serious concern that the homogeneity assumption does not
hold. Currently, with correlation bias included, the bias in the PES
estimate of undercount is 22%. Without correlation bias, the bias is 45%
of the estimate. In summary, the Committee could only support the
homogeneity assumption with some concern since the level of bias in the PES
was close to the point where artificial population analysis shows the
homogeneity assumption fails to hold.





when examining aggregate loss [11]. The Committee tended to accept these
findings, keeping in mind the numerous caveats mentioned above.


FOURTH ISSUE: DOES THE HOMOGENEITY ASSUMPTION HOLD? 

Summary: Just as in July 1991, the results on whether the homogeneity
assumption holds are inconclusive. The new research used to examine
the homogeneity assumption (called artificial population analysis)
indicates that the assumption does not hold when the bias in the
estimate gets to be about 25% or higher. Since the bias in the Post
Enumeration Survey (PES) estimate as currently measured is 22% to 45%,
the Committee was concerned.

An integral part of the PES/DSE system is to assume that everyone within a
post-stratum has approximately the same probability of being counted in the
PES. This is often referred to as having the same "capture probability."
As discussed in the part of the third issue having to do with correlation
bias, failure of this assumption leads to a bias in the DSE. It is also
important because of the way the sample is selected and used to make
estimates for states, cities, etc. Very few political units, including
states, have direct estimates from the PES. That is, the state (or city)
was not defined as a universe, and then a sample drawn from it to represent
it. Rather, the sample was drawn by region, type of area (large urban
area, other urban, rural), race, etc. Therefore, a sample case in
Tennessee (for example) also is used in the estimate of undercount for
Florida, Georgia, etc. This approach assumes homogeneity. Recognizing the
importance of this assumption, the Census Bureau designed a study (labeled
P-12) to analyze whether the homogeneity assumption held. The results of
P-12 were mixed or inconclusive.
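
The way one post-stratum's estimate carries over to every state can be
sketched as follows. This is a hedged illustration only: it assumes a single
adjustment factor per post-stratum applied to each area's census count in
that post-stratum, and the function name, post-stratum labels, factors, and
counts are all hypothetical.

```python
def synthetic_estimate(area_counts, adjustment_factors):
    """Adjusted count for one area under the homogeneity assumption.

    area_counts maps post-stratum label -> census count within the area.
    adjustment_factors maps post-stratum label -> assumed ratio of the
    post-stratum's dual-system estimate to its census count.
    """
    return sum(count * adjustment_factors[ps]
               for ps, count in area_counts.items())


# Hypothetical factors estimated once per post-stratum, from sample
# cases that may lie in several different states.
factors = {"ps1": 1.05, "ps2": 1.01}

# Hypothetical census counts by post-stratum for two states.
tennessee = {"ps1": 100, "ps2": 200}
florida = {"ps1": 300, "ps2": 50}

tn_adjusted = synthetic_estimate(tennessee, factors)
fl_adjusted = synthetic_estimate(florida, factors)
```

Both states share the same factors, which is exactly where the homogeneity
assumption enters: if capture probabilities differ within a post-stratum
from state to state, the shared factor is wrong for some of them.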

Recognizing this, the Committee asked for more extensive research into the
issue of homogeneity. The new research was called artificial population
analysis. Basically, items felt to be correlated with undercount were
selected. They were called surrogate variables. These items were then
scaled to the level of the undercount. For example, the mail return rate
of census questionnaires was one of these items. The mail return rate was
about 65% while undercount was about 2%. The 65% was scaled to 2%. Then
an area that had a mail return rate 5% greater than the national average
got a scaled mail return rate 5% above the national average.

We know mail return rates for every area in the country. Using the same
process used to estimate DSEs, we estimated this scaled mail return rate.
In effect, the comparison of the estimated scaled mail return rate to the
known scaled mail return rate substitutes for the comparison of estimated
undercount with known undercount.
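
The scaling step can be sketched as below. This is a minimal illustration
that assumes simple proportional scaling and reads "5% greater" as a relative
difference; the function name and area labels are hypothetical, and the 65%
and 2% figures are the approximate values quoted in the text.

```python
def scale_to_undercount(area_rates, national_rate, national_undercount):
    """Rescale a surrogate variable so its national level equals undercount.

    Proportional scaling preserves each area's relative distance from the
    national average.
    """
    factor = national_undercount / national_rate
    return {area: rate * factor for area, rate in area_rates.items()}


# Mail return rates in percent; 68.25 is 5% above the national 65.
rates = {"average_area": 65.0, "high_return_area": 68.25}
scaled = scale_to_undercount(rates, national_rate=65.0,
                             national_undercount=2.0)
```

Under this assumption the high-return area ends up with a scaled value of
about 2.1, which is 5% above the scaled national level of 2.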


[11] Summaries of loss function analysis results can be found in the
following C.A.P.E. minutes: C.A.P.E. minutes 5-4-92 Attachment 4; C.A.P.E.
minutes 6-1-92 Attachments 9-11; C.A.P.E. minutes 6-9-92 Attachment 5;
C.A.P.E. minutes 7-6-92 Attachments 2, 3.





FIFTH ISSUE: CAN THE INCONSISTENCY OF PES AND OTHER ESTIMATES BE 
EXPLAINED? 

Summary: Even though there were some points of concern, the Committee
is much more comfortable with the consistency of the revised Post
Enumeration Survey (PES) estimates and Demographic Analysis (DA) than
they were with the July 1991 PES estimates and DA. At the state level,
the Committee generally felt the revised PES estimates met their face
validity expectations with some individual state exceptions.


As part of the July 1991 decision whether to adjust the 1990 census, there
were many concerns about the PES estimates compared to other estimates,
mainly Demographic Analysis (DA). In particular, there was concern that
the PES estimated a higher population than DA and that the PES
estimated about a million more women than DA. In addition, PES estimates
were compared to "best professional judgement" estimates, mainly to see if
undercount was being measured by the PES in areas where undercount was
expected. This check was called face validity. Face validity checks,
though not rigorous, indicated some areas of concern in the PES estimates.
For these reasons, the Committee requested additional research to try to
investigate the apparent differences.

With regard to DA, the revised PES estimates are now much more consistent.
Attachment 7 contains a table summarizing the comparisons. The PES
estimate of total population was now lower than the DA estimate, a more
expected outcome. The estimated undercount from the PES at the national
level was 1.6% compared to an estimate of 1.8% from DA. The PES estimate
of women remained higher than DA (an unexpected result), but the difference
has been reduced from one million to about 400,000 and was within sampling
error. As expected, the PES estimates for Blacks (and in particular, young
Black males) were much lower than the DA estimates. This is a result of
correlation bias. Even though expected, the Committee was concerned about
this problem because there was no method to adequately add these people
back into PES estimates.

With regard to face validity checks, there also was now more consistency. 
Almost all of the changes between the revised PES and the July 1991 PES 
estimates were in the direction expected by the Committee.

Since intercensal estimates of states are of such importance, the Committee
asked for an analysis of revised PES state estimates compared with other
information on states to see if there was consistency. Basically, there
was consistency with a few exceptions. The exceptions were substantiated
by an independent analysis done by one of the Panel of Experts. The
Committee was concerned about these exceptions; therefore, they could only
conclude that, on average, there would be an improvement using adjusted
base counts for states.

In summary, even though there were some points of concern, the Committee
was much more comfortable with the consistency of the revised PES estimates
and DA than they were with the July 1991 PES estimates and DA. At the
state level, the Committee generally felt the revised PES estimates met
their face validity expectations with some exceptions.





THE DECISION PROCESS 

The decision process that led to the assessment of the Committee contained 
many parts. By far, the largest part was the year of extensive research and 
discussion between the Committee and the statistical staff at the Census
Bureau. That part of the decision process is summarized in this report and
recorded in far more detail in the minutes of the Committee. The decision
process culminated with three key discussions. These were a day-long meeting
with the Panel of Experts, a decision discussion meeting with the Committee, 
and a decision discussion meeting with the Executive Staff of the Census 
Bureau. This section of the report summarizes those three meetings. 

MEETING WITH PANEL OF EXPERTS:

The Census Bureau wanted to have outside review of the additional research
it had done since July 1991. The Census Bureau wanted to include some
Panel members who had not been too involved in the July 1991 decision in
order to get a fresh look. In addition, the Census Bureau considered the
outside expert advice it obtained in conjunction with the July 1991
decision. The Panel of Experts was sent materials in advance. In
addition, each member was asked to choose two of five key areas on which to
concentrate his or her attention. They were, of course, free to comment on
any other issue, and as expected, they did. The meeting with the Panel was
held on July 14, 1992. In order to place this summary of the Panel meeting
in proper context, it is important to understand that the agenda for the
Panel was restricted to major problems and that the Census Bureau
specifically requested critical review.

In summary, the Panel made comments on the following key points:

1. The Panel thought the additional research done by the Census Bureau 
was extremely thorough and useful. The Panel took the time to commend 
the Census Bureau for this effort. They felt this research took the 
Census Bureau a long way towards being able to adjust at some time, even 
if not fully at the present.

2. The Panel thought the Census Bureau should only adjust for the 
geographic areas for which it was comfortable supporting the decision on
technical grounds. Even then, there were bound to be some areas that
were adversely affected by an adjustment or no adjustment, even though
most were improved. The Panel urged the Census Bureau to examine the
exceptions and see if they were "seriously" hurt. If so, the Panel
recommended the Census Bureau reconsider an adjustment, even if it was
technically defensible on average. For areas below the level for which
there is technical backing to support adjustment, the decision about
whether to adjust was more of a policy issue. The Panel did point out
that errors in estimates of population change from the census year to
the year of interest could be large, and perhaps larger than errors from
adjustment, particularly for small areas. 

3. The Panel cautioned that many of the statistical analyses used by 
the Census Bureau (Loss Function, Total Error Model, etc.) were just 
tools and not exact decision mechanisms. 





4. The Panel would have felt more comfortable if the bias could be
removed from the PES estimates before their use in any potential
adjustment. The Census Bureau agreed with the concern of the Panel but 
knew of no adequate methodology to remove the bias by state, city, etc. 

In addition, the Panel expressed some concerns: 

1. The Panel was quite concerned about the negative values in the
fourth cell. The Panel suggested ways to alter the DSE process in order
to avoid the negative values.

2. While the Panel recognized the need to do something about
correlation bias, they also recognized the potential problems caused by
the inability to estimate the components of the bias separately. The
Panel was also concerned about the problems with the proposed allocation
scheme.

3. The Panel cautioned against loss function analysis where winners and
losers were tallied up. Instead, if the intent is to examine individual
losses/gains, the Panel recommended a Pitman nearness measure be used.

4. The Panel cautioned against too much reliance on the significance
level in the hypothesis test the Census Bureau was planning to use and
urged the Census Bureau to consider the implications of the approach to
hypothesis testing being studied by Statistics Canada.

5. The Panel cautioned that artificial population analysis, like the
P-12 study, was inconclusive about whether the homogeneity assumption
held.

6. Some Panel members expressed concern about the extensive use of
synthetic estimation in the adjustment process. (Examples: allocating
undercount estimates to areas below which there were direct estimates,
allocating bias, etc.)

Attachment 8 contains more detail from the meeting with the Panel of 
Experts. 





C.A.P.E. DECISION DISCUSSION 

On July 22, 1992, the Committee met with the Director to discuss each member's
opinion about the accuracy of adjusted base counts for use in intercensal
estimates. Prior to the main part of the meeting, one of the Committee
members made a suggestion based on some analysis he had performed. He
recommended the Committee consider a composite (50-50) estimate which would be
the simple average of the census count and the adjusted base. The reasoning
for the suggestion was that we have two estimates of population, both with
error. Despite massive research, it is still inconclusive about which is
better overall, for all levels of geography. Therefore, an average of the two
might make sense. There is precedent for this kind of averaging in other
Census Bureau work. Despite the lateness of the suggestion, the Committee
members were asked to comment on the new proposal.
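
The composite suggestion amounts to a weighted average of the two counts.
The sketch below is illustrative only: the function name, the weight
parameter, and the counts are hypothetical, and the default weight of 0.5
corresponds to the 50-50 proposal described above.

```python
def composite_estimate(census_count, adjusted_base, weight=0.5):
    """Weighted average of the adjusted base and the census count.

    weight is the share given to the adjusted base; 0.5 gives the simple
    average, 1.0 a full adjustment, and 0.0 the unadjusted census count.
    """
    return weight * adjusted_base + (1.0 - weight) * census_count


# Hypothetical area with a census count of 1,000 and an adjusted base
# of 1,020.
half = composite_estimate(1000, 1020)
sixty_forty = composite_estimate(1000, 1020, weight=0.6)
```

With these made-up numbers the 50-50 composite is 1,010, halfway between the
two counts, so it removes only half of the estimated undercount.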

To help in the overall discussion about whether to adjust the base for
intercensal estimates, there was a list of key uses and issues of intercensal
estimates. Committee members were asked to tie their opinions about potential
improved accuracy to the uses of the estimates and geographic level. The list
is shown in Attachment 9.

Each Committee member expressed his or her opinion about whether or not the
base for intercensal estimates should be adjusted. Though not unanimous, most
of the Committee members felt that adjustment of the base should be done at
the national and state level. For national and state uses of intercensal
estimates, most Committee members felt adjusting the base would make the
eventual estimates better on average. There was considerable concern about
the states for which it was uncertain whether adjustment would make an
improvement. Below the state level, the Committee could not make a
recommendation about improvement from adjustment and supported the census
counts. In terms of the issue of differential undercount and perception of
fairness, the Committee strongly felt that adjustment at the state and
national level would satisfy that element. The Committee could come to no
agreement on whether an adjustment to the base would improve overall accuracy
(accuracy at all levels of geography).

In addition to those summary findings, some other points were raised. These 
included: 

1. No matter what the decision, the Census Bureau needed to examine the
existing intercensal estimate challenge system [12]. Regardless of the
Census Bureau decision on adjusting the base, a political jurisdiction that
feels it was harmed by the Census Bureau decision can and will challenge.

2. Could we adopt the system used in Australia and perhaps Canada? The
census is not adjusted, but intercensal estimates are.


[12] Currently, there is a challenge system in place that allows
jurisdictions to question their intercensal estimates. The evidence supplied
by the jurisdiction is reviewed by Census Bureau staff. The staff selected are
not involved in the intercensal estimate operations. If the challenge is
accepted, the intercensal estimate is changed.




3. No matter what the decision on adjustment of the base for intercensal
estimates, the reliance on the current DSE system should be examined. Some
of the problems with it might never be solved. (See the final section of
this report, FUTURE.)

The meeting closed with a discussion of the 50-50 composite suggestion. Only 
a minority of the Committee favored the 50-50 composite as a first choice, 
although many of the Committee members thought the composite could be a 
possible acceptable alternative. During the discussion, several pros and cons 
of the suggestion were listed. 

PROS: 

1. It would produce estimates that are additive. A procedure following 
the Committee's general consensus of states and higher would not be 
additive. 

2. It is a move in the right direction. (This can also be viewed as a con
since it is only a partial correction, even at the national level.)

3. It dampens the effect of noise (bias, error, etc.) in the PES and
census.

4. At the substate level, the composite is probably better than the full
adjustment.

5. Even with an adjustment, there would still be a benefit for respondents
to take the effort to be counted in the future, because any potential
adjustment based on the 50-50 composite method would only be a partial
correction.

6. Analysis done by one Committee member showed that hypothesis test
results at the state level were much more favorable to the composite
estimate than to the full adjustment, even without including correlation
bias.

CONS: 

1. It is not as good an estimate at the national level as the adjusted
base, but it is probably a better estimate than an estimate with a fully
adjusted base for substate levels. Substate improvement is at the expense
of state and national estimates.

2. The two estimates (the DSE and the census) are not independent.

3. It was too late to fully examine the technical merits of the composite.

4. It is only half a solution to differential undercount.

5. It looks like a compromise or even like a "cop-out."

6. Why 50-50? 60-40 or some other combination might be better, and there
is no way to know.





EXECUTIVE STAFF DECISION DISCUSSION 

Following the Committee discussion, the Executive Staff of the Census Bureau 
met to give their views. Basically, the Executive Staff concentrated on
policy concerns since the Committee had discussed the technical issues. The
Executive Staff did not make a recommendation on whether or not to adjust the
base for intercensal estimates, but rather raised some issues. The following
points were raised at the Executive Staff meeting: 

1. It is very important to make sure that people understand that the
decision on whether to adjust the base for intercensal estimates is
different from the decision whether to adjust the full census. Even if
there is a decision to adjust the base for intercensal estimates, there is
no intention to adjust the 1990 census because research shows insufficient
technical justification.

2. The Census Bureau should do what it thinks it can support based on
statistical science.

3. The Census Bureau should consider the advice of users, but should not
be forced into a decision because of pressure from users.

4. The Census Bureau should consider the effect of the decision on the
public and in particular on its respondents.

5. The 50-50 composite suggestion looks arbitrary.

6. The adjustment issue is so complex, there is probably no single
intellectually coherent solution. Most likely, none of the available
options is fully consistent with the current research. Also, no matter
what the decision, some people will not be satisfied.

On balance, the Executive Staff felt very strongly that there should be
technical support for the eventual decision. The Executive Staff recognized
that many issues, some of them nontechnical, would need to be balanced in
making the final choice. Even so, it is very important for the Census Bureau
to be confident about the technical support for the decision it chooses. Not
only would the Census Bureau have to defend any decision, but the
professionalism of the agency can be questioned if the Census Bureau cannot
stand behind its decision on statistical grounds.





FUTURE 

Regardless of the choice about whether to adjust the base for intercensal
estimates, there were several concerns about the future raised during the
final discussions. Generally, it was felt that the problem of differential
coverage will continue in the future. Therefore, there were strong
recommendations that research in the area of differential undercount should
continue as input into the design of the year 2000 census. In particular, the
following points were made.

1. The Census Bureau should examine alternatives to the Dual System
Estimation process used in 1990. Some of the problems of that approach may
continue despite best efforts, meaning that a full adjustment based on such
a system might never be possible.

2. Even though it might not be statistically efficient, coverage
measurement surveys in the future should have samples and estimation
systems that produce direct estimates for key political areas (like
states).

3. The Committee process was very successful and could be a good model for
the future. Examples of the benefits included sufficient time, timely
senior staff input, clear goals, etc.

4. Any proposed undercount estimation/adjustment scheme must be simple.
It must be simple enough so the technical aspects can be evaluated and it
must be simple enough so it can be explained, even to those without
extensive statistical knowledge.

5. Methods of incorporating coverage measurement into the census process
should be examined.

6. A system that produces one set of counts rather than unadjusted and
adjusted counts is definitely preferred.





Attachment 2 

LIST OF MEMBERS OF PANEL OF EXPERTS
WHO ATTENDED THE MEETING WITH THE CENSUS BUREAU


Mr. Don Royce
Senior Methodologist
Statistics Canada Social Survey
Methods Division

Mr. Wesley Schaible
Associate Commissioner
Office of Research and Evaluation
Bureau of Labor Statistics

Dr. Fritz Scheuren
Director, Statistics of Income
Division
Internal Revenue Service

Dr. Bruce Spencer
Department Head
Statistics Department
Northwestern University

Dr. Theresa A. Sullivan
Chair and Professor for the
Department of Sociology
University of Texas at Austin

Dr. James Trussell
Associate Dean of Woodrow Wilson
School and Director of the
Office of Research
Princeton University

Mr. Joseph Waksberg
Chairman of the Board
WESTAT

Dr. Tommy Wright
Research Staff Member
Oak Ridge National Laboratory

Dr. Donald Ylvisaker
Director for the Division of
Statistics, Mathematics
Department
University of California

Dr. Alan Zaslavsky
Assistant Professor
Statistics Department
Harvard University





Attachment 1: List of

COMMITTEE ON ADJUSTMENT OF POSTCENSAL ESTIMATES (C.A.P.E.)

MEMBERS


Dr. Barbara Everitt Bryant    Director
Mr. C. L. Kincannon           Deputy Director
Mr. William Butz              Associate Director
Mr. Charles Jones             Associate Director
Dr. Robert Tortora            Associate Director
Mr. Peter Bounpane            Assistant Director
Ms. Paula Schneider           Chief, Population Division
Mr. John Thompson             Chief, Decennial Statistical Studies Division
Dr. Robert Fay                Senior Mathematical Statistician
Dr. Howard Hogan              Statistical Research Division
Dr. John Long                 Population Division
Dr. Mary Mulry                Decennial Statistical Studies Division
Dr. Gregory Robinson          Population Division
Mr. Michael Batutis           Population Division
Mr. David Whitford            Decennial Management Division




ATTACHMENT 3A: PES ESTIMATES OF UNDERCOUNT BY RACE AND SEX
JULY, 1992


Table of PES Estimates for Selected Race/Origin/Sex Groups

                                       JULY, 1991               JANUARY, 1992            JULY, 1992
                                       Original PES             Revised PES              357 PES
Race/Hispanic/Sex             Census    Estimate   Std. Error    Estimate   Std. Error    Estimate   Std. Error

Total                      248709873   253979141   472946.472   252959473   461310.829   252712821   489754.595
  Male                     121239418   124249093   245445.426   123648997   238663.637   123623143   273518.304
  Female                   127470455   129730048   246737.086   129310476   241383.831   129089678   254912.175
Black                       29986060    31505838    95559.460    31295058    93633.743    31377094   167925.028
  Male                      14170151    14974382    49052.934    14857391    47952.832    14900868    62912.806
  Female                    15815909    16531456    52914.183    16437667    51898.230    16476225    96609.126
Non-Black                  218723813   222473303   424675.175   221664415   414933.642   221335728   453076.281
  Male                     107069267   109274711   222153.799   108791606   216160.510   108722274   249791.220
  Female                   111654546   113198592   220800.163   112872809   216539.374   112613453   239423.186
Asian or Pacific Islander    7273662     7504906    36264.289     7485602    36157.768     7447371   102828.516
  Male                       3558038     3688436    19879.800     3674532    19946.424     3684895    60817.829
  Female                     3715624     3816470    18469.115     3811069    18435.209     3762478    57240.421
American Indian              1878285     1976890    21726.014     1970537    21588.870     2051976    26259.820
  Male                        926056      980874    11312.232      977738    11302.066     1020059    13248.050
  Female                      952229      996016    10612.782      992799    10467.531     1031917    13252.478
Hispanic                    22354059    23590274   103458.969    23471101   102033.476    23521183   180090.423
  Male                      11388059    12086513    57498.441    12008688    56356.003    12052241   114778.144
  Female                    10966000    11503761    52275.143    11462214    52082.441    11468942    84750.443


Table of Undercount Rates for Selected Race/Origin/Sex Groups

                                       Original PES          Revised PES           357 PES
Race/Hispanic/Sex             Census     UC Rt   SE(UC Rt)     UC Rt   SE(UC Rt)     UC Rt   SE(UC Rt)

Total                      248709873     2.075       0.182     1.680       0.179     1.584       0.191
  Male                     121239418     2.422       0.193     1.949       0.189     1.928       0.217
  Female                   127470455     1.742       0.117     1.423       0.184     1.254       0.193
Black                       29986060     4.824       0.289     4.163       0.287     4.433       0.511
  Male                      14170151     5.371       0.310     4.626       0.308     4.904       0.52?
  Female                    15815909     4.328       0.306     3.783       0.504     4.008       0.563
Non-Black                  218723813     1.685       0.168     1.327       0.185     1.180       0.202
  Male                     107069267     2.018       0.199     1.583       0.196     1.520       0.226
  Female                   111654546     1.364       0.192     1.079       0.190     0.852       0.211
Asian or Pacific Islander    7273662     3.081       0.466     2.831       0.469     2.332       1.349
  Male                       3558038     3.535       0.520     3.170       0.526     3.443       1.594
  Female                     3715624     2.642       0.471     2.504       0.472     1.245       1.302
American Indian              1878285     4.986       1.044     4.682       1.044     4.520       1.222
  Male                        926056     5.569       1.089     5.286       1.095     5.183       1.231
  Female                      952229     4.396       1.019     4.086       1.011     3.864       1.235
Hispanic                    22354059     5.240       0.416     4.759       0.414     4.962       0.728
  Male                      11388059     5.779       0.446     5.170       0.445     5.511       0.900
  Female                    10966000     4.675       0.433     4.329       0.435     4.385       0.707


Note: Due to the nature of the data used to compute these estimates for the 357 poststrata PES design, the American Indian counts in
both Table 1 and Table 2 above include Eskimos and Aleuts for the 357 PES. The census count used for this group was 1,950,234.
Counts used to compute the original PES columns and the revised PES columns are shown in the tables.
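
The undercount rates tabulated above appear to be the net shortfall of the census count relative to the PES estimate, expressed as a percentage of the estimate. The sketch below assumes that definition (the function name is illustrative, not from the source) and checks it against the Black total under the original PES, where the census count is 29,986,060 and the estimate is 31,505,838.

```python
def undercount_rate(census_count, pes_estimate):
    """Net undercount as a percentage of the PES estimate (assumed definition)."""
    return 100.0 * (pes_estimate - census_count) / pes_estimate


# Black total, original PES design: census count and PES estimate from Table 1.
black_census = 29986060
black_original_pes = 31505838

rate = undercount_rate(black_census, black_original_pes)
print(f"{rate:.3f}")  # compare with the 4.824 printed in Table 2
```

The same function applied to the 357 PES estimate for this group (31,377,094) gives roughly 4.433, which also agrees with the table.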





ATTACHMENT 3B: REVISED PES ESTIMATES OF UNDERCOUNT BY AGE-RACE-SEX
JULY, 1992


Table 1. PES Estimates for Selected Race/Origin/Sex Groups for the 0 to 17 Age Group
(357 Poststrata PES Design)

JULY, 1992

                                                    357 PES                  Undercount    Standard
Race/Origin/Sex Group                    Census    Estimate    Std. Error          Rate       Error

Total                                  63604432    65695382    191195.568         3.183       0.282
  Male                                 32584278    33649795     97745.288         3.166       0.281
  Female                               31020154    32045587     93459.542         3.200       0.282

Black                                   9584415    10311019     95917.245         7.047       0.865
  Male                                  4849497     5215800     48390.736         7.023       0.863
  Female                                4734918     5095218     47527.287         7.071       0.867

Non-Black                              54020017    55384363    172047.616         2.463       0.303
  Male                                 27734781    28433994     88325.776         2.459       0.303
  Female                               26285236    26950369     83724.989         2.468       0.303

Asian or Pacific Islander               2083387     2152880     46537.029         3.226       2.092
  Male                                  1063264     1099038     23792.412         3.255       2.094
  Female                                1020123     1053842     22745.817         3.200       2.089

American Indian, Eskimo, or Aleut        696967      742996     12481.466         6.195       1.576
  Male                                   354875      378205      6315.004         6.169       1.567
  Female                                 342092      364791      6166.491         6.222       1.585

Hispanic                                7757500     8164834     77292.661         4.989       0.899
  Male                                  3971164     4179630     39551.088         4.988       0.899
  Female                                3786336     3985204     37742.086         4.990       0.900


Table 2. PES Estimates for Selected Race/Origin/Sex Groups for the 18 to 29 Age Group
(357 Poststrata PES Design)

JULY, 1992

                                                    357 PES                  Undercount    Standard
Race/Origin/Sex Group                    Census    Estimate    Std. Error          Rate       Error

Total                                  48050811    49530134    192936.681         2.987       0.378
  Male                                 24312055    25105216    129869.843         3.159       0.501
  Female                               23738756    24424918    113605.768         2.809       0.452

Black                                   6419397     6727151     60784.870         4.575       0.062
  Male                                  3110320     3225832     38478.196         3.581       1.150
  Female                                3309077     3501319     41388.086         5.491       1.117

Non-Black                              41631414    42802983    174778.637         2.737       0.397
  Male                                 21201735    21879384    121313.350         3.097       0.537
  Female                               20429679    20923599    102738.356         2.361       0.479

Asian or Pacific Islander               1581231     1686549     47226.618         6.245       2.625
  Male                                   802067      893983     35821.446        10.282       3.595
  Female                                 779164      792566     31415.861         1.691       3.097

American Indian, Eskimo, or Aleut        414071      441406      7298.043         6.193       1.551
  Male                                   210263      224725      4083.000         6.435       1.700
  Female                                 203808      216683      3782.708         5.942       1.642

Hispanic                                5525130     5903999     83906.191         6.417       1.330
  Male                                  2984897     3207779     67903.944         6.948       1.970
  Female                                2540233     2696220     31412.026         5.785       1.098





ATTACHMENT 3 
Page 3 of 3 

ATTACHMENT 3B: REVISED PES ESTIMATES OF UNDERCOUNT BY AGE-RACE-SEX
JULY, 1992

Table 3. PES Estimates for Selected Race/Origin/Sex Groups for the 30 to 49 Age Group
(357 Poststrata PES Design)


Race/Origin/Sex Group

Census

357 PES

Estimate

Std. Error

Undercount

Rate

Standard

Error

Total 

733 H363 

74327J49 

178300.740 

1.343 

0.237 

Halo 

34701757 

34945492 

114334.225 

1.850 

0.304 

fOMl* 

37032406 

37341457 

*6074.030 

0.001 

0.232 

■ lack 

0300310 

4705742 

57437.333 

4.457 

0.429 

Halo 

3041742 

4099433 

30014.144 

0.290 

0.049 

faaala 

4450536 

4406129 

31219.727 

3.204 

0.654 

bon* Black 

43014065 

65421500 

144451.401 

0.92* 

0.234 

Mat* 

32439995 

32B46059 

104016.209 

1.296 

0.310 

foaalo 

32574050 

32755520 

90532.424 

0.554 

0.273 

Aslan or Pacific Islandar 

2373705 

2394349 

35297.044 

0.942 

1.459 

Halo 

1120527 

1127547 

23073.089 

*0.005 

2.119 

foaolo 

1245256 

1264702 

19001.040 

1.054 

1.470 

Aaarlcan Indian, f*k<ao, or Alsut 

543021 

540400 

5744.045 

2.950 

0.995 

NalO 

263521 

274134 

2012.700 

4.560 

0.972 

faaala. . 

200294 

244264 

3232.422 

1.397 

1.121 

Hispanic 

5961207 

4271153 

41500.742 

4.942 

0.932 

Mala 

3029043 

3225477 

40130.944 

4.090 

1.140 

faaala 

2932144 ■ 

3045476 

33430.513 

3.727 

1.057 


Table 4. PES Estimates for Selected Race/Origin/Sex Groups for the 50 and Over Age Group
(357 Poststrata PES Design)

JULY, 1992


Race/Origin/Sex Group

Census

357 PES
Estimate

Std. Error

Undercount

Rate

Standard

Error

fatal 

43740247 

43139934 

144191. 019 

•0.919 

0.242 

Mala 

28041328 

27902440 

91400.020 

-0.349 

0.329 

faaala 

35470939 

35237316 

90573.330 

•1.193 

0.203 

Slack 

5441930 

3433142 

34074.194 

•0.044 

0.424 

Mala 

2340572 

2359403 

22227.003 

-0.300 

0.944 

faaala 

3313350 

3273339 

19516.909 

•1.216 

0.403 

ton- 11 act 

50058337 

37324794 

* 159823.394 

-0.924 

0.200 

Mala 

25492734 

25542037 

•9232.535 

*0.307 

0.351 

faaala 

32345501 

31903957 

94047.229 

•1.193 

0.304 

Asian or Pacific islandar 

1235259 

1211593 

20344.491 

• *1.953 

1.732 

Mala 

544100 

544307 

7192.919 

0.023 

1.274 

faaala 

471079 

447204 

10017.033 

•3.674 

2.086 

aaarlcwt Indian, Eskiaa. or Alaut 

304373 

307172 

3091.413 

0.911 

0.997 

Mala 

130323 

140994 

1032.019 

1.734 

t-277 

faaala 

165032 

144176 

1534.022 

0.195 

0.933 

Hispanic 

Mala 

3110222 

3181190 

43724.253 

2.231 

1.405 

1402933 

1439354 

27994.289 

2.529 

1.894 

faaala 

1707247 

1741042 

32479.412 

1.985 

1.839 








Page 1 of 2 


ATTACHMENT 5: THE 357 POSTSTRATUM DESIGN
FOR POSTCENSAL ESTIMATION - JULY, 1992

The following page defines the 51 poststrata groups and seven
age-sex groups used to poststratify the Post-Enumeration Survey
(PES). These were used to develop dual system estimates for use
in the postcensal estimation program. Cross-classification of
the 51 poststrata groups with the seven age-sex groups yields
357 poststrata cells for which dual system estimates have been
developed.
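
The count of 357 cells follows directly from crossing the two classifications (51 x 7). A minimal sketch of that cross-classification is given here; the group numbers and the age-sex labels are placeholders chosen for illustration, not the Bureau's own identifiers.

```python
from itertools import product

# 51 poststrata groups, numbered 1..51 (placeholder numbering).
poststrata_groups = range(1, 52)

# Seven age-sex groups, with hypothetical labels AS1..AS7.
age_sex_groups = [f"AS{i}" for i in range(1, 8)]

# Each poststratum cell is one (group, age-sex) pair.
cells = list(product(poststrata_groups, age_sex_groups))

print(len(cells))  # 357
```

Each of the 357 pairs would carry its own dual system estimate.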

The following rough definitions are used:

"Urbanized area 250,000+" means that the PES sample block was
     part of an Urbanized Area the total population size of which
     was greater than 250,000.

"Other urban" refers to all PES blocks that were part of an
     Urbanized Area not greater than 250,000 or were part of an
     other urban place.

"Non-urban" means all rural areas and other areas not falling
     into the above categories.

"Owner/Non-Owner" is determined from the tenure variable on the
     PES questionnaire. All persons in group quarters are
     non-owners by definition.

"Asian and Pacific Islander" refers to all people who report
     themselves as being Asian and Pacific Islander. This group
     is not restricted to the West or Mid Atlantic as it was in
     the July, 1991 estimates. Asians and Pacific Islanders of
     Hispanic origin are included here.

"American Indians on Reservations" include American Indians
     living on reservations and Tribal Trust Lands. All other
     concepts (Black, Non-black Hispanic, etc.) are defined as in
     the census.

"North East" states are Connecticut, Maine, Massachusetts, New
     Hampshire, Rhode Island, Vermont, New Jersey, New York, and
     Pennsylvania.

"South" states include Delaware, District of Columbia, Florida,
     Georgia, Maryland, North Carolina, South Carolina, Virginia,
     West Virginia, Alabama, Kentucky, Mississippi, Tennessee,
     Arkansas, Louisiana, Oklahoma, and Texas.

"Midwest" states are Illinois, Indiana, Michigan, Ohio,
     Wisconsin, Iowa, Kansas, Minnesota, Missouri, Nebraska,
     North Dakota, and South Dakota.

"West" states include Arizona, Colorado, Idaho, Montana, Nevada,
     New Mexico, Utah, Wyoming, Alaska, California, Hawaii,
     Oregon, and Washington.





Revised Post-Stratification for Postcensal Estimation


Post-Strata Groups                    North East    Midwest    South    West

Non-Hispanic White & Other
  Owner
    Urbanized Areas 250,000+               1            2         3       4
    Other Urban                            5            6         7       8
    Non-Urban                              9           10        11      12
  Non-owner
    Urbanized Areas 250,000+              13           14        15      16
    Other Urban                           17           18        19      20
    Non-Urban                             21           22        23      24

Black
  Owner
    Urbanized Areas 250,000+
    Other Urban
    Non-Urban
  Non-owner
    Urbanized Areas 250,000+
    Other Urban
    Non-Urban

Non-Black Hispanic
  Owner
    Urbanized Areas 250,000+
    Other Urban
    Non-Urban
  Non-owner
    Urbanized Areas 250,000+
    Other Urban
    Non-Urban

Asian & Pacific Islander
  Owner
  Non-owner

American Indians on Reservations


Age-Sex Groups







Page 1 of 2


ATTACHMENT 6:

Total Error of the Net Undercount Rate
Assuming No Correlation Bias and Synthetic Estimation
of Net Component Errors

JULY, 1992


Evaluation                                U*      B(U)    St. Dev.       Total    95% Interval
Poststratum                                                   B(U)    St. Dev.

Non-Hispanic White and Other, Owner
  Urban 250k+                          -0.50      0.32        0.99        1.06    (-2.95, 1.31)
  Other Urban                           0.11      0.21        0.25        0.34    (-0.79, 0.59)
  Non-Urban                            -0.22      0.86        0.87        1.00    (-3.07, 0.92)

Non-Hispanic White and Other, Non-Owner
  Urban 250k+                           2.33     -0.06        0.60        0.96    (0.47, 4.32)
  Other Urban                           2.92      1.70        0.82        1.13    (-1.03, 3.47)
  Non-Urban                             5.30      0.47        0.74        1.35    (2.13, 7.53)

Black, Non-Black Hispanic, Asian and Pacific Islander, Urban 250k+
  Owner                                 1.33      0.84        0.44        0.67    (-0.86, 1.82)
  Non-Owner                             7.13      0.80        0.48        0.94    (4.44, 8.21)

Black, Non-Black Hispanic, Asian and Pacific Islander, Other Urban & Non-Urban
  Owner                                 2.07      2.38        0.90        1.25    (-2.81, 2.18)
  Non-Owner                             6.44      3.98        0.94        1.63    (-0.80, 5.72)

National                                1.61      0.73        0.30        0.36    (0.17, 1.60)


* Based on PES population only.
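
The source does not state how the 95% intervals were formed. The printed rows are, however, consistent (to within about 0.01) with an interval centered on the bias-corrected rate, U - B(U), extending two total standard deviations on either side. The sketch below assumes that rule; the function name and the multiplier of 2 are inferences from the table, not the Bureau's documented method.

```python
def interval_95(u, bias, total_sd, k=2.0):
    """Interval centered on the bias-corrected rate (assumed rule, k inferred)."""
    center = u - bias
    return (center - k * total_sd, center + k * total_sd)


# Non-Hispanic White and Other, Non-Owner, Non-Urban row from the table above:
# U = 5.30, B(U) = 0.47, total standard deviation = 1.35.
low, high = interval_95(5.30, 0.47, 1.35)
print(f"({low:.2f}, {high:.2f})")  # compare with the printed (2.13, 7.53)
```

Applied to the National row (1.61, 0.73, 0.36), the same rule gives about (0.16, 1.60) against the printed (0.17, 1.60), so small rounding differences should be expected.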





Assuming Synthetic Estimation of Net Component Errors


Evaluation                                U*      B(U)    St. Dev.       Total    95% Interval
Poststratum                                                   B(U)    St. Dev.

Non-Hispanic White and Other, Owner
  Urban 250k+                          -0.50      0.31        0.99        1.06    (-2.94, 1.32)
  Other Urban                           0.11      0.18        0.25        0.34    (-0.76, 0.62)
  Non-Urban                            -0.22      0.81        0.88        1.00    (-3.03, 0.97)

Non-Hispanic White and Other, Non-Owner
  Urban 250k+                           2.33     -0.68        0.76        1.07    (0.87, 5.16)
  Other Urban                           2.92      1.54        0.84        1.14    (-0.90, 3.65)
  Non-Urban                             5.30     -0.12        0.90        1.45    (2.52, 8.31)

Black, Non-Black Hispanic, Asian and Pacific Islander, Urban 250k+
  Owner                                 1.33      0.80        0.45        0.68    (-0.83, [illegible])
  Non-Owner                             7.13     -1.37        1.30        1.54    (5.42, 11.56)

Black, Non-Black Hispanic, Asian and Pacific Islander, Other Urban & Non-Urban
  Owner                                 2.07      2.23        0.95        1.28    (-2.71, 2.41)
  Non-Owner                             6.44      3.55        1.05        1.70    (-0.50, 6.28)

National                                1.61      0.35        0.33        0.38    (0.50, 2.03)


* Based on PES population only.





ATTACHMENT 8: THE MEETING WITH THE PANEL OF EXPERTS

While the Panel came to no consensus about whether the base for intercensal estimates
should be adjusted, the Panel was extremely impressed with the extensive research done by
the Census Bureau. The concerns raised by the Panel were not criticisms of the Census
Bureau's work, but rather were indications of the difficulty and complexity of the overall
issue as well as the fact that some of these problems may never be fully solved. The
Panel concentrated its discussion on five areas as requested by the Census Bureau. These
were the most difficult problem areas that Census Bureau statisticians had not been able
to fully resolve. Not only was the discussion limited to difficult problem areas, but as
requested by the Census Bureau, the Panel members were critical and raised concerns.
Reading just a list of concerns can lead to an unbalanced view of what Panel members felt
about the adjustment issue in general. Therefore, the parameters under which the Panel
operated should be kept in mind in order to put the following more detailed discussion of
Panel concerns in proper perspective.

FIRST AREA: TOTAL ERROR MODEL INCLUDING CORRELATION BIAS 

During this discussion the Panel mentioned that it didn't see an easy alternative to
the current method of treating correlation bias, but Panel members were uneasy about
certain aspects of it. For one, the Panel was quite concerned about the negative
fourth cells. In addition, there was concern that we weren't estimating the level
of the bias properly. In particular, one Panel member felt we should consider
comparing the unbiased PES estimates (taking out the bias) to DA in order to
estimate the level of correlation bias. Another panel member expressed serious
concern that the Census Bureau assumed all correlation bias was male. This panel
member pointed to his research to show that there also are problems of differing
capture probabilities in the female population. Currently, the Census Bureau's
treatment of correlation bias assumes that doesn't occur. It was also during this
discussion that most of the Panel recommended that the Census Bureau try to remove
the bias from the PES estimates before making any adjustment. Another panel member
went through the PES/DSE process in some detail with an emphasis on whether or not
it was understandable to an average person and whether or not it was credible. He
pointed out several parts of the process that were of concern to him, particularly
the extensive use of synthetic estimation. He also cautioned that if new research
between July 1991 and the present uncovered new findings, then he wouldn't be
surprised to see additional research after July 1992 turn up new results and new
estimates of undercount. Another Panel member strongly desired that total error be
broken out separately by persons of Hispanic ethnicity. This section of the meeting
concluded with a discussion of the problem of inconsistent race classification
between systems (example: PES and DA), which the Panel felt was a significant issue
that needed further research.


SECOND AREA: LOSS FUNCTION ANALYSIS 

This part of the meeting was quite technical, with a review of the various loss
functions under consideration. Most of the Panel advised against counting up
winners and losers (for example: states that gained or lost in a loss function
analysis done on states). Instead, one Panel member recommended a Pitman nearness
measure, which he uses when faced with this kind of problem. Then, there was a
discussion of aggregate loss. The Panel pointed out that decisions on aggregate
loss may make sense statistically, but that the "losing" political areas might have
a problem. Also, it was during this discussion that the Panel made a recommendation
that the results of loss function analysis be used with caution. Loss function
analysis is a tool, depends on personal standards of judgment, and is not an exact
decision mechanism. It also was during this discussion that the Panel reiterated a
theme they raised in the first topic. Panel members were concerned that there is
too much confusion about the undercount/adjustment issue by the "person on the
street." The Panel recommended that the Census Bureau try to alleviate that in the
future. Finally, there was a discussion about the large number of states for which
it doesn't matter much whether or not there is an adjustment. Both sides of the
case were discussed: if so, why bother to adjust? Or, if so, adjust all states in
order to correct a problem in a few states, and the error in most other states won't
be too bad. This discussion ended with another theme heard often. The total error
model is a good tool to try alternative assumptions. It is not an exact decision
mechanism.


THIRD AREA: HYPOTHESIS TESTS 

The Census Bureau had recognized the limitations of loss function analysis. In
particular, once you had two losses to compare, was the difference between them a
"real" difference, or could it be attributable solely to chance since these were
sample estimates? To help answer that question, the Census Bureau planned some
statistical hypothesis tests. The Panel was asked to review the Census Bureau
plans.

This part of the discussion was led by the expert from Statistics Canada, since
Statistics Canada was faced with a similar problem. The discussion was extremely
technical. Before getting to the issue of the hypothesis test, the Panel member
cautioned that several key questions had to be answered, and they all had an effect
on the eventual hypothesis test. These questions included:

What is the quantity of interest? (Total population, population share, etc.)

Which loss function would be used?

How accurate are your target numbers?

How do you account for error in estimating the target numbers?

The bulk of the discussion centered on the technical performance of the
hypothesis test assuming the above questions had been answered satisfactorily.
Basically, the Panel pointed out that we were not simply dealing with a standard
hypothesis test. Instead, we planned to use one of the sets of estimates based on
the results of the hypothesis test. Under those conditions, a model could be
developed to examine the true level of risk for the hypothesis test. At present,
Statistics Canada had developed such an approach. The Panel member urged the Census
Bureau to take this finding into account in the significance level of the Census
Bureau's proposed hypothesis test. During this part of the discussion, this panel
member warned that if there is a high positive bias in the estimate of undercount,
then the hypothesis test can be misleading, and in fact, adjustment can be very
problematic when the estimate of undercount has a large bias. Also, it was pointed
out that Statistics Canada feels its estimates of undercount at the province level
are adequate for use in adjusting intercensal estimates, but not at the sub-province
level. Whether or not to adjust below the province level will be more a policy call
than a technical decision. Finally, it was during this part of the meeting that the
Panel repeated its recommendation that if estimates of bias are good enough for use
in determining target numbers for loss function analysis, then they should be
removed from the PES estimates before any potential adjustment.


FOURTH AREA: ARTIFICIAL POPULATION ANALYSIS 

Because of the way the PES/DSE system operates, the homogeneity assumption is a key
one. In conjunction with the July 1991 decision, the Census Bureau studied
homogeneity and recorded the results in a study called P-12. Since the homogeneity
assumption was so key, the Census Bureau undertook additional work in a study called
Artificial Population Analysis. The Panel was asked to examine various aspects of
the analysis. The Panel member who did part of the P-12 study led the discussion.

The Panel member started with a brief review of study P-12, which he characterized as
inconclusive. In reviewing the artificial population analysis, he thought the
Census Bureau had taken a major additional step to try to investigate the issue, but
he still felt the results were inconclusive. In his opinion, only two of the eight
surrogate variables considered by the Census Bureau were associated enough with
undercount to be considered (percent enumerated by mail and substitution rate).
He wondered if there were better alternative surrogate variables. The Panel also
expressed some concern about the constant scaling of the surrogate variables to
undercount. Variable scaling might be preferred. Likewise, the Panel was concerned
about the constant introduction of bias into the artificial population analysis.
Once again, variable bias would be preferred. Even so, the Panel was concerned that
artificial population analysis showed failure of the homogeneity assumption when the
constant bias was 25% or greater. One panel member did some work on his own. From
that study, he concluded that by using substitution rate, adjustment looks better.
Using poverty, the results are mixed. And, using unemployment rate, the census
looks better. This kind of analysis supports the conclusion that even with all the
new research, the results are inconclusive. This panel member felt that a
considerable amount of additional work would be needed to get a definitive answer on
whether the homogeneity assumption held.


FIFTH AREA: COMPARISON OF PES TO DA 

Generally, at the national level, estimates of population from DA are felt to be
"better" than estimates from a post-censal survey. Even so, the DA estimates are
subject to some error. Before discussing the comparison of the PES and DA, one
panel member shared her work on the quality of DA numbers. In addition to the known
problems with DA, she pointed out some additional places where the DA estimates
could be in error. These included:

1. Over-correction for the under-registration of black males. (This error has the
effect of overestimating the undercount.)

2. The problem of Mexicans near the border who register the birth in the US, but
then return to Mexico to raise the child. (This problem has the effect of
overstating the undercount.)

3. Under-reporting of infant deaths near the border since the birth certificate can
be resold. (This problem overstates the undercount.)

4. Concerns about the consistency and reliability of reporting data on vital
statistics forms, especially those done by a third party. (These types of errors
might not affect the estimate of total undercount, but would affect the estimates by
age-race-sex.)

5. Concern about a change in a person's self-perception of race/Hispanic origin over time.
These characteristics could be recorded one way at birth and another at death.
(This problem only has an effect on DA estimates of undercount by race/Hispanic origin.)

Even with these and other problems, there is still general confidence in the DA
estimates, particularly at the national level. That is why the Panel was concerned
about some inconsistencies between the PES and DA. In particular, one panel member
reviewed the Census Bureau work that compared PES estimates by state with DA and
other information. She was quite concerned about the states that seemed quite
inconsistent. At this point, another panel member indicated that another
independent study he had done confirmed the inconsistency in a similar set of
states. The Panel discussed the issue and concluded that in an adjustment where
there would be overall improvement for states, some states would be adversely
affected, even if most were improved and the US average was improved. The Panel
strongly recommended that the Census Bureau examine whether these exception states were
hurt "seriously."

The meeting closed with a brief discussion of the actual mechanism of the intercensal
estimate process. During that discussion, there was a question about the accuracy of
intercensal estimates. That question couldn't be answered exactly, but there was some
summary information provided. Basically, by comparing the estimate in a census year to
the census count, you can estimate the error in the estimates over a 10-year period. The
following table summarizes the Census Bureau findings.


AREA                         LEVEL OF ERROR OVER 10 YEARS*

States                       1.5 - 2.5%
Places over 50,000           4.0%
Places 5,000 to 50,000       7.0 - 8.0%
Places under 5,000           16.0 - 20.0%


* Level of error as measured in previous decades. These error estimates
exclude any estimated undercoverage in the census.
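
The comparison described in the closing discussion, setting a census-year estimate against the census count itself, amounts to a percent error. The sketch below illustrates that calculation only; the function name and the figures are hypothetical and are not drawn from the Census Bureau's evaluation.

```python
def percent_error(estimate, census_count):
    """Absolute error of an estimate as a percentage of the census count."""
    return 100.0 * abs(estimate - census_count) / census_count


# Hypothetical place: estimated at 51,000 where the census counted 50,000.
print(percent_error(51_000, 50_000))  # 2.0
```

Averaging such errors over all areas of a given size class would give one summary figure per class, comparable in form to the ranges in the table.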















ATTACHMENT 9: USES OF INTERCENSAL ESTIMATES AND ISSUES CONSIDERED BY C.A.P.E.

Uses of Intercensal Estimates:

1. Survey controls

2. Denominators for per capita Federal statistics

3. Funding programs

a. State populations either for direct funding or as the first
tier in a funding program

b. Substate areas of 100,000 population or larger

c. Substate areas below 100,000 population

Other Concerns:

1. National population estimates

2. Differential undercount and the perception of fairness

3. Overall accuracy






013 lei land Courty 

uoorw 

i c ypoj 

u>/ „ 




015 virtfia* County 

107525 

104554 1.941 

0.623 

103793 

1.2zi 0.687 

10 

001 Kent Comty 

110993 

114068 2.696 

0.443 

112995 

1.772 0.39i 


003 Hew C»ftle County 

441946 

456338 3.154 

0.510 

450294 

1.854 0.516- 



113229 

116255 2.603 

0.501 

115083 

1.811 0.452 


001 District of Colmfela 

606900 

638747 4.986 

0.517 

626309 

3.407 0.90V 

12 

001 Alachua County 

181596 

188223 3.521 

0.429 

186051 

2.394 0.635- 



126994 

130912 2.993 

0.477 

129096 

1.629 0.536 

'12 

009 Ireverd Courty 

398971 

410499 2.607 

0.446 

404953 

1.476 omt 

12 


1255488 

1291812 2.812 

0.453 

1277394 

1.715 

12 

015 Charlotte Comty 

110975 

112871 1.660 

0.526 

111891 

0.825 ran 

12 

010 Clay County 

105916 

106604 0.766 

0.595 

107762 

1.648 0.376 

12 


152099 

156294 2.684 

0.J26 

154951 

1.145 0.464 

12 

025 Dade Comty 

1937094 

1997643 3.031 

0.591 

2011300 

3.690 0.945 

12 


*72971 

697735 3.549 

0.463 

687821 

2.159 0.549 

12 

033 laeaafcl# County 

262798 

271007 3.029 

0.466 

268329 

2.061 0.495 

12 

053 Hernando Court y 

101115 

100975 *0.139 

0.612 

102051 

0.911 0.319 

12 

057 Hi l latoorough County 

B34054 

851177 2.092 

0.448 

653411 

2.268 0.478 

12 

069 lake County 

152104 

155095 1.929 

0.481 

154003 

1.233 0.341 

12 

071 lee County . 

335113 

343538 2.452 

0.465 

339519 

1.310 0.466 

12 


192493 

199708 3.613 

0.437 

196621 

2.100 0.615 

12 

081 Manatee County 

211707 

216819 2.358 

o.soe 

214609 

1.352 0.513 

12 

083 Marion Comty 

194833 

199845 2.508 

0.487 

197743 

1.472 0.354 

12 

089 Martin Courty 

100900 

103232 2.259 

0.592 

102120 

1.195 0.406 

12 

091 Okaloosa County 

143776 

148410 3.122 

0.505 

146346 

1.756 0.593 

12 

095 Oranee County 

477491 

700574 3.295 

0.458 

693622 

2.326 0.530 

12 

097 Osceola Courty 

107728 

111188 3.112 

0.564 

109720 

1.816 0.479 

12 

099 Halo leach Couity 

8635 18 

886676 2.612 

0.464 

876764 

1.511 0.493 

12 


281131 

281049 *0.029 

0.614 

283694 

0.904 0.395 

12 - 

103 Pinal III County 

851659 

861306 1.120 

0.448 

860431 

1.020 0.555 

12 

105 Folk County 

*05312 

416923 2.768 

0.470 

411911 

1.587 0.405 

12 

I'll ft. lucle Courty 

150171 

154362 2.715 

0.479 

152554 

1.562 0.474 

12 

1t9 Sarasota County 

277776 

283554 2.038 

0.350 

279921 

0.766 0.505 

12 

117 tenlnole Cemty 

287529 

'■ 297007 3.191 

‘o:s*9.- 

292736 

1.77V 0.505 

12 


370712 

380601 2.598 

0.512 

373737 

1.331 0.463 

IS 

021 Xbb Couity 

149967 

154963 3.224 

0.453 

133035 

2.005 0.475 

IS 


216935 

224122 3.207 

0.435 

221102 

1.885 0.506 

IS 

063 Clayton County 

182052 

184137 1.132 

0.562 

186841 

2.563 0.561 

IS 

067 Cobb Comty 

*47749 

453335 1.277 

0.544 

45*480 

1.914 0.547 

IS 

089 Decslb Couity 

5*5837 

553706 1.421 

0.533 

561155 

2.730 0.601 

IS 

121 Fulton Courty 

648951 

671488 3.356 

0.442 

668695 

2.953 0.731 

IS 

135 Gwinnett Comty 

352910 

356619 1.040 

0.611 

359473 

1.126 AUt 

IS 

219 Muscoyee County 

179278 

185474 3.341 

0.505 

183097 

2.066 B 

IS 

2*9 aielwond County 

189719 

195914 3.162 

0.443 

194873 

2.645 

19 


120317 

121720 1.153 

0.717 

122654 

1.905 0.751 

IS 

003 Honolulu Courty 

836231 

*61243 2.904 

0.370 

852074 

1.859 0.131 

19 

009 Haul County 

100374 

101991 1.191 

0.714 

102187 

1.774 0.74 

16 

001 Mo Couity 

205775 

208426 1.272 

0.594 

209575 

1.813 0.46: 

1? 

019 OiMpelan Comty 

173025 

177031 2.263 

0.553 

175373 

1.340 0.4V 

17 

031 Cook Comty 

5105067 

5212195 2.055 

0.423 

5186429 

1.569 0.57 

17 


781666 

789453 0.986 

0.499 

784956 

0.41* 0.39 

17 

089 Kane Comty 

317471 

324570 2.187 

0.524 

320253 

0.169 0.41 

17 

097 lake Comty 

516418 

524672 1.573- 

0.558 

319660 

0.624 0.33 

17 

099 la Salle Comty 

106913 

106411 -0.472 

0.538 

107150 

0.222 0.41 

17 


183241 

184777 0.831 

0.510 

1B37BO 

0.293 0.31 

17 

113 net ten Comty 

129110 

131127 2.008 

0.512 

130121 

0.729 0.41 

17 


117206 

119350 t.961 

0.570 

U7JS4 

0.551 0.3! 

17 

119 Madison Comty 

249231 

251156 0.764 

•0.432 

ad* 

0.413 0.31 

17 

143 Feeria Comty 

182127 

116534 1.917 

0.534 - 

ietno 

0.735 0.3*. 

17 

161 lock island Comty 

148723 

151424 1.714 

0.534 

149787 

0.711 0.4! 



262852 

266701 1.443 

0.423 

266421 

1.340 0.41 

17 


178386 

161571 1.758 

0.542 

179149 

0.426 0.3* 

17 

179 Tmeevett Comty 

123692 

124872 0.943 

0.541 

123942 

0.202 0.41 

17 


357313 

363J30 1.710 

0.554 

359200 

0.325 0.2 

17 

201 WlnrwtoafO County 

252913 

257702 1.858 

0.521 

254302 

0.546 0.3 

11 

003 Allen Comty 

300636 

306760 1.911 

0.534 

302274 

0.476 0.3 



119659 

121730 1.701 

0.537 

120341 

0.566 0.4 

11 

039 llkhart Comty 

156191 

151664 1.554 

0.530 

156797 

. 0.382 0.4 

11 


108916 

109674 0.673 

0.513 

109211 

0.252 0.3 



475594 

417249 1.392 

0.552 

480322 

0.9*4 0.< 



107066 

107036 *0.021 

0.462 

107366 

0.211 0.< 

IS 

999 Madison Comty 

130669 

132535 1.406 

0.514 

131090 

.0.321 0.4 



160 


ia 

097 ftar ion County 

797139 

803090 

0.037 

0.577 

008143 

1.359 

o.s; 

A 

105 Honrow County 

106978 

111004 

1.096 

0.552 

110094 

1.013 

0.4‘ 

■ 

127 forter County 

120932 

130033 

0.040 

0.659 

129207 

0.274 

0.3 1 


HI ST. Jwsph County 

247092 

23178* 

1.000 

0.535 

240403 

0.544 

0.3! 

. 10 

157 Tlppseanos County 

130590 

133031 

1.029 

0.550 

132090 

1.135 

0.4! 

It 

163 Vandsrturth Ctmty 

163030 

160249 

1.097 

0.596 

1*5711 

0.394 

0.4 

18 

167 Vi 90 County 

106107 

107712 

1.490 

0.517 

106607 

0.4*9 

0.3 

It 

013 Hock Hawk County 

123790 

12*453 

2.100 

0.553 

124329 

0.507 

0.3 

19 

113 Lfm County 

160767 

171900 

1.023 

0.5*1 

169329 

0.332 

0.3 

19 

153 folk County 

327140 

334027 

2.062 

0.537 

329530 

0.725 

0.4 

19 

163 Scott County 

150979 

134206 

2.093 

0.533 

152246 

0.032 

0.4 

20 

091 Johnson County 

353036 

330306 

0.930 

0.435 

357029 

0.553 

0.4 

20 

173 Sodowlck County 

403662 

409349 

1.309 

0.407 

407780 

1.010 

0.4 

20 

177 Shsuns* County 

160976 

1*4773 

2.304 

0.525 

161045 

0.537 

0.3 

20 

209 Wyondotto County 

16199} 

163*74 

2.222 

0.49* 

1*420* 

1.14* 

0.4 

21 

067 foyotto County 

223366 

233197 

3.342 

0.602 

229930 

1.905 

0.7 

21 

111 Jsffsrson County 

664937 

605007 

2.930 

0.439 

*7*77* 

1.749 . 

0.5 

21 

117 Konton County 

142031 

149923 

2.400 

0.993 

144235 

1.520 

0.5 

22 

017 Caddo forts* 

240233 

296120 

3.072 

0.420 

25439* 

2.400 

O.S 

22 

019 Calcaoiou farls* 

160114 ■ 

172029 

2i7!7 

0.409 

170974 

1.6*1 

0.4 

22 

033 s»*t Baton (suit fsrish - 

300109 

392277 

3.103 

0.395 

390149 

2.574 

0.5 

22 

051 Jsffsrson forls* 

444306 

490900 

2.326 

0.470 

437937 

2.103 

O.S 

22 

053 kofsyottt fsrish 

164762 

169013 

2.974 

0.409 

1*0129 

2.000 

0.* 

22 

071 Orisons fsrish 

496930 

5H3S0 

3.424 

0.406 

51393* 

3.307 

o.t 

22 

073 Ouachita fsrish 

142191 

146297 

2.007 

0.400 

144953 

1.905 

0.4 

22 

079 Ragldts 'fsrish 

131994 

133009 

2.612 

0.399 

133999 

1.020 

0.3 

22 

103 St. la— ny fsrish 

144300 

147804 

2.230 

0.431 

14*074 

1.611 

0.3 

23 

001 Androscoggin County . 

10323? 

104912 

•0.331 

0.589 

10*120 

0.012 

O.S 

23 

005 Cu—rland County 

24113$ 

"243619 

0.197 

0.539 

24524* 

0.861 

0.5 

23 

011 Ronnsbsc County 

115*04 

117901 

1.399 

0.693 

11*902 

0.581 

0.5 

23 

019 fsnohacot County 

146601 

147374 

0.699 

0.5*3 

147730 

0.770 

0.5 

23 

031 Tort County 

164307 

166109 

0.914 

0.952 

169633 

0.633 

0.5 

2* 

003' Ann* Arundel County - 


-431624 

4*016 

0.537 

434447 

1.659 

0.4 

24 

005 OaltHaor* County 

692134 

696229 

0.900 

0.967 

702812 

1.519 

0.! 


013 Carroll County 

123372 

124098 

0.909 

0.606 

124911 

1.232 

0.4 

■ 

017 Chsrlos County 

101134 

102192 

1.016 

0.371 

102794 

1.595 

0.4 

V 

021 frsdsrlck County 

130200 

132604 

1.570 

0.494 

' 152690 

1.626 

0.4 

24 

023 Harford County 

182132 

103499 

0.749 

0.983 

105010 

1.560 

0.: 

24 

027 Howard County 

107320 

109033 

0.902 

0.9*2 

190409 

1.610 

o.< 

24 

031 Nontf— ry County 

737027 

7641 U 

0.979 

0.9*3 

771160 

1.033 

0.4 

24 

033 frincs George's County 

729260 

740060 

1.490 

0.179 

751507 

2.970 

c.< 

24 

043 Washington County 

121393 

124002 

2.732 

0.4*4 

123237 

1.496 

o.< 

24 

310 Islftnors city 

7)6014 

772082 

4.672 

0.311 

799127 

3.043 

o.) 

2S 

001 IsmotaOls County 

166609 

109009 

1.729 

0.099 

187904 

0.691 

0.! 

2) 

003 lerk Shirt County 

139332 

139722 

0.2*3 

0.320 

140900 

0.023 

0.! 

23 

003 Oristoi County 

306323 

909239 *0.212 

0.994 

509*37 

0.6S0 

0. 

23 

009 Issos County 

670000 

670474 

0.039 

0.179 

671451 

0.204 

0. 

23 

013 Masgden County 

456310 

437*99 

0.347 

0.989 

490054 

0.301 

0. 

23 

013 Hoopshlrc County 

146960 

147943 

0.929 

0.9*3 

147840 

0.866 

0. 

23 

017 MldOlossa County 

1390460 

1402907 

0.316 

0.600 

1399207 

0.053 

0. 

23 

021 Sort Oik County 

416007 

618007 

0.324 

0.693 

611139 

•0.010 

0. 

23 

023 fly— «h County 

435274 

• 43*306 

0.294 

0.900 

436400 

0.258 

0. 

23 

023 Suffolk County 

663906 

670099 

0.924 

0.744 

600010 

2.404 

0. 

23 

027 Worcostsr County 

709705 

711296 

0.210 

0.337 

713339 

0.509 

0. 

2* 

*017 Soy County 

11)723 

113132 

1.243 

0.537 

111095 

0.153 

0. 

26 

021 isrri on ‘County 

161370 

163661 

1.393 

0.598 

162674 

0.796 

0. 

26 

•025 Cgjfngun County 

OkO'tfstfps#* County 

063 Ifntf— County 

135902 

130140 

1.360 

0.517 

136672 

0.505 

0. 

26 

430439 

438000 

1.901 

0.530 

434600 

0.953 

0 . 

26 

201912 

208509 

2.209 

0.534 

286009 

1.460 

0. 

26 

073 Jackson County 

1497S6 

131333 

1.173 

0.526 

130189 

0.200 

0 . 

26 

077 sol— 100 County 

223411 

227212 

1.673 

0.520 

224997 

0.607 

0. 

26 

001 Kont County 

300611 

309273 

1.697 

0.52* 

904393 

0.730 

0. 

26 

091 Livingston County 

113*49 

11*408 

0.63* 

0.511 

113499 

*0.126 

0. 

26 

099 (tocoakt County 

717400 

722997 

0.719 

0.522 

7187*6 

0.190 

0 . 

26 

119 Monroe County 

133000 

134*42 

0.774 

0.511 

133783 

0.137 

0 

26 

121 Ruskogon County 

130903 

1*1494 

1.333 

0.535 

159784 

0.501 

0 

26 

129 Ooklond County 

1003392 

1094932 

1.036 

0.481 

1088374 

0.439 

0 

26 

139 Ottowa County 

1077*0 

189993 

1.131 

0.605 

188480 

0.3*7 

0 

mk 

US Saginaw County 

21194* 

2101S3 

1.947 

0.S37 

2135*7 

0.739 

0 

m 

* 

147 St. CUif County 

143*07 

147341 

1.177 

0.440 

149894 

0.1*9 

0 


161 Washtenaw County 

282937 

288*79 

1.989 

0.91* 

206038 

1.004 

0 

26 

163 woyns County 

2111*07 

2160394 

2.233 

0.42* 

2144402 

1.529 

0 





27 

003 Anoka Coor.Tr 

243041 

245062 

0.903 

0.517 

244251 0.250 0.37a 

27 

037 Dakota County 

275227 

270030 

1.011 

0.512 

27*471 0.450 0.30$ 

27 

053 Nsnnepin County 

1032431 

1044052 

1.109 

0.301 

1041265 0.040 0.407 

27 

109 Olattsd County 

10647® 

100411 

1.790 

0.553 

100753 0.265 C.41T 

27 

123 l«B«r County 

405703 

491319 

1.130 

0.362 

490307 0.943 0.479: 

27 

137 It. Lout* County 

198213 

201005 

1.003 

0.576 

198402 0.126 0.430 

27 

. 145 liaarm County 

110791 

121193 

1.982 

0.039 

119274 0.405 0.560 

•77 

103 UaaMnften County 

145890 

147150 

0.050 

0.506 

146053 0.108 0 gfe 

20 

047 Karri ton County 

103305 

170273 

2.662 

0.422 

108420 1.018 

20 

049 Ninos County 

254441 

204616 

3.919 

0.4a 

241731 2.7«5 

28 

059 Jackson Conty 

115243 

110271 

2.540 

0.400 

117089 1.574 0.407 

29 

019 loom County 

112379 

115311 

2.543 

0.550 

113420 1.092 0.444 

29 

047 Clay County 

153411 

1547a 

0.063 

0.390 

154290 0.S7S 0.4U 

29 

077 Orson* County 

207949 

211970 

1.097 

0.545 

208941 0.475 0.429 

29 

095 Jackson Court y 

433232 

645000 

1.034 

0.376 

040624 1.154 0.400 

29 

099 Jaff arson County 

171360 

172005 

0.659 

0.510 

171032 0.147 0.504 

29 

103 It. Charts* County 

212907 

215015 

0.980 

0.431 

213851 0.4*2 0.380 

29 

109 It. Louis County 

993529 

10100S 

1.033 

0.450 

*99753 0.623 0.370 

29 

510 It. Louis city 

390005 

400203 

2.036 

0.516 

*05175 2.096 0.082 

30 - 

111 lollowstons County 

- 1.13419 

. 114710 

1.125. 

0.605 

115539 1.835 0.459 

31 

055 Douflos County 

410444 

421910 

1.297 

6.419 

420353 0.930 0.453 

31 

109 Lancsstor County 

213041 

210220 

2.101 

0.411 

215022 0.042 0.420 

31 

153 Sarpy County 

102503 

104050 

1.410 

0.492 

103700 1.154 0.483 

32 

003 Clark County 

741459 

759060 

2.422 

0.518 

750092 2.271 0.521 

32 

031 vashoa County 

254607 

250090 

1.034 

0.550 

261007 2.429 0.510 

33 

Oil Nillsborouiti County 

330073 

335052 

•0.125 

0.570 

330911 0.630 0.500 

33 

013 Nsrrimck County 

120005 

121590 

1.310 

0.034 

120910 0.7a 0.539 

33 

015 locking County 

245045 

246907 

0.454 

0.504 

247556 0.091 0.5a 

XT 

* 0t7 Strafford County 

” *104233 

” 104021 

•0.204 

0.513 

105001 0.007 Q.5S7 

34 

001 Atlantic County 

224327 

227037 

1.541 

o.sa 

224943 1.153 0.374 

34 

003 ftorpon County 

125300 

029201 

0.470 

0.300 

020920 -0.542 0.706 

34 

005 Sur tins ton Couity 

395000 

401239 

1.539 

0.005 

394939 >0.032 0.500 

34 

007- Condon Couvty ■ - 

302024 . 

-310056 

1.416 . 

0.421 

5034*9 0.120 0.719 

34 

•11 CuMosrl and County 

136053 

140210 

1.536 

0.530 

139050 l.ia 0.379 

34 

013 Xaaoa County 

771200 

602200 

2.999 

0.540 

79907* 2.685 0.782 

34 

015 Cloucostor County 

230002 

233020 

1.241 

0.499 

229100 -0.424 0.624 

34 

017 Nudsun County 

553099 

506477 

2.705 

0.577 

50925* 2.839 1.107 

34 

019 Nuntsrdon County 

107776 

107001 

0.079 

0.403 

108451 0.023 0.745 

34 

021 Morcor County 

325024 

33t4a 

1.094 

o.sa 

328447 0.039 0.554 

34 

023 Mtddlossa County 

671700 

677602 

0.071 

o.ia 

472992 0.100 0.712 

|4 

025 Monoouth County 

553124 

550412 

0.591 

0.574 

530005 -0.421 JtoS7 

34 

027 Morrla County 

421353 

425501 

0.975 

0.717 

41913* -0.529 ^BO 

34 

029 Ocsur County 

433203 

433510 

0.072 

0.599 

4*9899 -0.769 

34 

031 fossolc County 

453060 

' 461045 

1.902 

0.541 

459194 1.330 0.851 

34 

035 fatrsat County 

240279 

241009 

0.575 

0.578 

23951* -0.320 0.611 

34 

037 lusts* County 

130943 

132073 

0.050 

0.729 

13121* 0.210 0.331 

34 

039 Union County 

493019. 

503004 

1.024 

0.500 

497433 0.727 0.771 

39 

001 Oomollllo County 

400577 

497433 

3.427 

0.51* 

491*5* 2.293 0.*3i 

39 

013 Dons Ana County 

135510 

141374 

4.2*3 

0.345 

139939 1.105 0.40! 

30 

001 A l fcsny Couity 

292594 

295111 

0.053 

0.530 

2938*9 0.*27 0.4* 

30 

005 Irom county 

1203709 

124507* 

3.370 

0.730 

120570* *.*97 1.41' 

30 

097 Irooao County 

212100 

2125a 

0.1*3 

0.341 

2136*9 0.716 0.431 

30 

013 Chautauqua County 

*41095 

141997 

0.072 

0.323 

1*30*7 0.805 0.33 

30 

027 Out chstt County 

259402 

241192 

0.402 

0.343 

261800 0.890 0.43 

J0 

029 iris County 

908532 

97059* 

0.020 

0.308 

969213 0.070 0.05 

30 

049 Jofforaon County 

110943 

11213* 

1.000 

0.342 

112035 1.503 0.71 

30 # 

0(7 Kinfi Coifnty 

2300004 

23 7909* 

3.329 

0.5*2 

2309150 3.70* 0.90 

30 

055 Monro* County 

713960 

722929 

1.240 

0.334 

71612* 0.301 0.64 

30 

059 Nassau County 

12873a 

1290120 

0.477 

0.371 

1277**9 -0.773 0.82 

30 

001 Maw fork County 

1407536 

1537991 

3.2*1 

0.5*0 

1541441 3 .*97 0.90 

34 

003 Nlatar* County 

220750 

22179* 

0.407 

0.537 

220729 -0.012 0.31 

30 

005 Ontida Couity 

250030 

251005 

0.385 

0.510 

252906 0.819 0.*4 

30 

007 Onondoia County 

460971 

472639 

0.010 

0.532 

469750 0.163 0.63 

30 

071 Oranf* County 

307047 

109752 

0.600 

0.544 

310082 1.O40 0.*1 

30 

075 OswtfO County 

121771 

121070 

0.001 

0.423 

122882 0.904 0.61 

30 

001 Ihiisna County 

1951590 

2004192 

2.024 

0.024 

1992006 2.029 O.W 

30 

0(3 tonssolaor Couity 

154*29 

154995 

0.305 

0.535 

153072 0.415 0.3' 

30 

005 kichannd County 

370977 

364245 

3.371 

0.533 

37*7*2 -0.052 0.7 

30 

007 lackland County 

205475 

2*94*7 

1.540 

0.088 

26*771 -0.266 0.7 

30 

019 It. lawronco Couity 

11197* 

112733 

0.473 

0.504 

113179 1.06* 0.6 

30 

091 loratofa County 

101270 

161400 

0.117 

0.415 

1*1850 0.316 0.3 

30 

093 Utonoctady County 

149205 

H9052 

0.371 

0.524 

1*8509 -0.460 0.7 





103 Suffolk Couity 

132186* 

13307*3 0.667 0.576 

1313346 

-0.6*9 

0.72 

‘ 36 

111 Ulster Comity 

165304 

167147 1.103 0.612 

167385 

1.2*4 

0.73 


119 Westchester Couity 

87*866 

8906*8 1.772 0.641 

879705 

0.530 

0.61 

gk 

001 A 1 usance Covnty 

108213 

111*18 2.877 0.439 

109811 

1.453 

o.«o 


021 lmco*U County 

17*821 

179768 2.75 2 0.463 

177162 

1.321 

0.41 


033 CilMt County 

118*12 

122063 2.991 0.498 

120094 

1.401 

0.42 

37 

DSl Cutter lend Couity 

27(366 

28*189 3.386 0.419 

28060* 

2.152 

0.31 

37 

057 Davidson Couity 

126677 

130309 2.936 0.380 

128544 

1.453 

0.*! 

37 

063 Dvfhau County 

111833 

188378 3.473 0.462 

185785 

2.126 

0.57 

37 

067 foriyth County 

2658 78 

274462 3.128 0.430 

* 270363 

1.639 

o.*t 

37 

071 Cation County 

175093 

17782* 1.536 0.464 

177837 

1.343 

0.45 

37 

081 Out 1 ford County 

3*7420 

338847 3.184 0.443 

353615 

1.732 

0.3( 

37 

Ilf Mecklenburg County 

511433 

$28981 3.317 0.424 

523306 

2.269 

0.5! 

37 

129 Mew Monover Couity 

12028* 

124111 3.08* 0.*38 

122381 

1.714 

0.5* 

37 

133 Onslow Couity 

1*9838 

13*392 2.950 0.37* 

153141 

2.137 

0.41 

37 

147 Mitt County 

10792* 

110732 2.336 0.423 

1 10516 

2.345 

0.5! 

37 

131 lendolph County 

106546 

109790 2.935 0.595 

108009 

1.354 

0.4! 

37 

153 lebeson Couity 

105179 

108097 2.699 0.452 

107475 

2.136 

0.5! 

37 

159 Iowan County 

110605 

111*20 0.732 0.32* 

112305 

1.514 

o.s; 

37 

183 wake County - 

• <23380 

'*38*28 3.432 8.43* 

432630 

2.138 

o.*« 

37 

191 Wayne County 

104666 

107153 2.321 0.401 

106769 

1.969 

0.31 

38 

017 Csss Cauwy 

10287* 

105012 2.036 0.571 

103452 

0.559 

0.44 

39 

003 Allen County 

109755 

111410 1.486 0.310 

110262 

0.460 

0.4 

39 

017 Out let County 

291479 

295537 1.373 0.535 

292902 

0.486 

0.3: 

39 

023 Clark Couity 

1*7548 

149800 1.503 0.S19 

148179 

0.426 

0.41 

39 

023 Clement Couity 

150187 

151277 0.721 0.514 

150784 

0.396 

o.s: 

39 

029 Cel v* Ians County 

108276 

107546 -0.679 0.384 

108375 

0.091 

0.5> 

39 

035 Cuyahoga County 

1412140 

1*29*31 1.210 0.431 

1427932 

1.106 

0.4' 

39 

043 felrfleld Couity 

' 103(61 

103995 0.514 0.427 

103594 

0.129 

o.s: 

39 

049 franklin Couity 

961437 

970249 0.908 0.463 

973539 

1.446 

o.s: 

39 

057 Creene Couity 

136731 

138166 1.039 0.632 

137700 

0.704 

o.s: 

39 

061 m sal l ton Couity 

866228 

176347 1.135 0.424 

876795 

1.205 

0.41 

39 

fl«3 take Couity 

215499.. 

216985 0.685 0.5.19. 

216122 

0.288 

0.3 

39 

089 licking Couity 

128300 

1290*2 0.573 0.432 

128558 

0.201 

0.5 

39 

093 lore in County 

271126 

275982 1.760 0.520 

272668 

0.363 

0.3 

>9 

093 Lucas Couity 

462361 

*65553 0.686 0.477 

467096 

1.014 

0.4 


099 Mahoning Couity 

26*806 

268995 1.557 0.528 

266443 

0.614 

0.3 

A 

103 Medina Couity 

122354 

123157 0.652 0.414 

122484 

0.106 

0.4 


113 Mnt gentry County 

575809 

S 83903 1.729 0.528 

380267 

1.113 

0.4 

39 

133 Mrtage Couity 

1*2583 

144241 1.148 0.S73 

143615 

0.717 

0.5 

39 

139 llcMend Couity 

126137 

127129 1.32* 0.S2Q 

126535 

0.314 

0.4 

39 

131 Start Couity 

567583 

372544 1.331 0.525 

368829 

0.337 

0.3 

39 

133 Suasit County 

51*990 

323958 1.712 0.520 

518979 

0.769 

0.4 

39 

153 ThsttuU County 

227813 

230339 1.097 0.560 

228736 

0.403 

0.3 

39 

165 Warren Couity 

113909 

114657 0.452 0.498 

114158 

0.218 

0.3 

39 

169 weyne Couity 

101*61 

100828 -0.621 0.605 

101745 

0.279 

0.6 

39 

173 Wood Couity 

113269 

113881 0.337 0.446 

113912 

0.365 

0.4 

AO 

027 Cleveland Couity 

17*251 

178292 2.265 0.466 

177*45 

2.020 

0.5 

AO 

031 C— nets Couity 

111*86 

11403 2.915 0.418 

113736 

1.996 

0.3 

AO 

109 Oklahett Couity 

59H11 

613697 2.295 8.419 

.612788 

2.150 

0.3 

AO 

143 Tulsa County 

5033*1 

514637 2.195 0.453 

512955 

1.874 

O.S 

A1 

003 CIkUmi Couity 

278850 

279977 0.403 0.724 

281892 

1.079 

o.< 

41 

029 Jackson Couity 

1*6389 

150125 2.489 '0.537 

149287 

1.941 

0.* 

A1 

039 lane County 

282912 

289415 2.247 0.551 

289266 

2.197 

0.* 

Al 

0*7 Marlon County 

228*83 

234494 2.563 0.508 

233587 

2.185 

0.4 

41 

031 Mwltnenah Couity 

583887 

5980*9 2.368 0.489 

S93788 

1.668 

o.< 

Al 

1)67 WasMngt on couity 

31 153* 

314044 0.793 0.688 

315806 

1.346 

0.1 

A? 

88* Allegheny Couity 

1336*49 

1346520 0.748 0.600 

1331707 

-0.356 

o.; 

A2 

007 leaver Couity 

186093 

186376 0.152 0.593 

185 256 

-0.452 

0.1 

42 

011 lefts Couity 

336523 

3374J4 0.270 0.536 

338569 

0.604 

0.1 

a 

013 llalr Couity 

130542 

130430 -0.086 8.532 

131077 

0.40* 

0.4 

42 

017 lucks Couity 

541174 

545735 0.836 0.726 

537873 

-0.614 

O.l 

42 

019 lut (or Couity 

152011 

133225 8.790 0.660 

152898 

0.579 

0.4 

42 

021 Cattrta Couity 

163029 

162949 -0.049 0.556 

163876 

0.517 

0.< 

42 

027 Centre Couity 

123786 

124397 0.491 0.570 

125635 

1.472 

0. 

42 

029 cuoater Couity 

376396 

380542 1.090 0.704 

377088 

0.184 

0.! 

42 

0*1 Cutter land Couity 

195237 

195365 0.0SS 0.575 

195256 

•8.001 

o.: 

42 

0*3 0o««ain County 

237813 

2(1035 1.337 0.552 

239154 

0.361 

0. 

42 

0*3 Delaware Couity 

547831 

S54009 1.147 0.694 

545064 

•0.473 

0. 


•49 trie County 

273372 

276888 0.475 0.529 

277235 

0.600 

0. 

gk 

031 Fayette County 

143331 

1*5958 0.416 0.742 

146681 

0.907 

0. 

w 

033 franklin Couity 

121082 

122079 0.817 0.632 

122180 

0.899 

0. 





42 

069 t *c k»u»nru County 

219039 

219814 

•0.103 

0.532 

217294 -0.803 0.73? 

42 

071 LWKHltr Cotfil) 

422922 

423976 

0.272 

0.564 

126526 0.8«g 0.S2J ' 

42 

075 Lebanon County 

113744 

113779 

0.031 

0.543 

114518 0.676 0.587 

42 

077 lefcigh County 

291130 

291961 

0.285 

0.515 

289980 -0.396 0.861 

42 

079 Luiome County 

326149 

327769 

-0.116 

0.546 

326439 -0.524 0.593 

42 

091 lyc e-iing County 

116710 

119822 

0.094 

0.S38 

119511 0.670 0.493 

■42 

095 Hereof County 

121003 

121190 

0.154 

0.552 

121627 0.513 0486 

42 

091 Montgomery County ' 

678111 

693019 

0.719 

0.667 

673620 -0.667 ^B 

42 

095 Morthompton County 

247105 

247696 

0.235 

0.527 

246917 -0.076 

42 

101 Philadelphia County 

1595577 

1606249 

1.287 

0.609 

1608942 1.452 0.742 

42 

107 SdniyUlU County 

152595 

153416 

0.542 

6.631 

152999 0.264 0.525 

42 

125 Washington County 

204594 

205463 

0.428 

0.738 

204549 -0.018 0.506 

42 

129 Westmoreland County 

370321 

371539 

0.328 

0.750 

369009 -0.356 0.551 

42 

133 lark County 

- 339574 

340569 

0.292 

0.572 

341321 0.512 0.472 

44 

003 lent County 

161 133 

161499 

6.225 

0.854 

159353 -1.117 0.776 

44 

007 Providence County 

596270 

597016 

0.125 

0.580 

597960 0.283 0.697 

44 

009 Washington County 

110006 

110452 

0.404 

0.638 

110982 0.880 0.633 

49 

003 A Ikon County 

120940 

124770 

3.070 

0.542 

123291 1.907 0.403 

49 

007 Anderson County 

U5196 

149574 

2.927 

0.502 

147268 1.407 0.373 

49 

CIS Borkoloy County 

’ 129776 

133469 

3.515 

0.555 

132081 2.502 0.472 

49 

019 Charleston County 

295039 

304929 

3.212 

0.437 

302751 2.547 0.580 

49 

041 Florence County 

114344 

119062 

3.149 

0.453 

116745 2.056 0.654 

49 

045 Greenville County 

320167 

330290 

3.065 

0.494 

325537 1.650 0.467 

49 

051 worry County 

144053 

147941 

2.562 

0.452 

146650 1.771 0.455 

49 

063 Losing ton County 

167611 

173093 

3.162 

0.583 

170341 1.602 0.375 

49 

079 Blchland County 

265720 

295225 

3.220 

0.421 

293299 2.584 0.544 

49 

093 Sportonburg County 

226900 

233790 

2.990 

0.499 

230614 1.654 0.374 

49 

095 Swot or County 

102637 

105121 

2.363 

0.403 

105017 2.267 0.500 

49 

091 fork County 

131497 

133960 

1.939 

0.454 

133717 1.660 0.409 

44 

099 Minnehaha County 

123909 

126103 

1.619 

0.S78 

124220 0.331 0.442 

42 

037 Davidson County 

5107(4 

5324JJ 

4.066 

0.521 

522044 2.157 0.617 

47 

065 Moil ton County 

285536 

293917 

2.652 

0.442 

290664 1.764 0.512 

47 

” 093 Kn6a COUU9 

335749 - 

345091 

2:704 

0.466- 

341491 1.679 0.502 

47 

125 Horn gome ry County 

100499 

104034 

3.399 

0.463 

102449 1.923 0.519 

47 

149 Autherford County 

119570 

122462 

3.178 

0.466 

120716 1.771 0.511 

47 

157 Shelby County 

626330 

961616 

4.095 

0.432 

947949 2.539 0.58V 

47 

163 Sullivan County 

143596 

146794 

2.179 

0.499 

145270 1.152 0.437 

47 

145 Sooner County 

103291 

105733 

2.319 

0.596 

104756 1.409 0.343 

49 

027 Soil County 

191099 

197377 

3.166 

0.397 

195908 2.410 0.563 

44 

029 Beaar County 

1165394 

1220995 

2.916 

0.499 

1230141 3.639 0^44 

49 

039 Sraioria County 

191707 

196965 

2.670 

0.494 

195577 1.979 

49 

041 Braxos County 

121662 

126396 

3.567 

0.520 

125980 3.192 Bb 

49 

061 Caoeron County 

260120 

269903 

3.625 

0.754 

269459 3.179 0.993 

49 

0BS Collin County 

264036 

271624 

2.794 

0.479 

249149 1.900 0.412 

49 

113 Dallas County 

1852910 

1929904 

3.975 

0.409 

1912100 3.101 0.62C 

49 

121 Denton County 

273525 

202791 

1.277 

0.444 

279483 2.132 0.491 

46 

133 fetor County 

116934 

122713 

3.135 

0.4(1 

121299 i.94v o.sa: 

49 

141 |1 Paso County 

591610 

611279 

3.218 

0.611 

617397 4.177 0.991 

49 

157 fort Bond County 

225421 

233251 

3.337 

0.459 

230752 2-310 0.33i 

49 

147 Galveston County 

217399 

223599 

2.773 

0.399 

221787 1.979 0.49) 

49 

113 Gregg County 

104949 

107799 

2.645 

0.417 

106936 1.960 0.» 

49 

201 Morris County 

2619199 

2939399 

4.123 

0.421 

2913597 3.340 0.63 

49 

215 Hidalgo County 

393545 

399356 

3.959 

0.983 

399991 4.112 0.94 

49 

245 Jefferson County 

239397 

2*6592 

2.911 

9.499 

241776 1.796 0.44 

49 

303 lobbock County 

222636 * 

229952 

3.139 

0.466 

229182 2.430 0.58 

a 

309 Mclennan County 

169123 

194513 

2.791 

0.393 

193547 2.185 0.54 

49 

329 Midland County 

106611 

109999 

3.070 

0.466 

109645 1.872 0.41 

49 

339 Montgomery County 

182201 

196761 

2.442 

0.300 

185687 1.877 0.41 

49 

335 Mutees County 

291143 

299691 

2.949 

0.533 

301959 3.581 0.71 

49 

423 Smith County 

151309 

155316 

2.590 

0.390 

154521 1.952 0.31 

49 

439 ttrrwt County 

1170189 

1212931 

3.523 

0.695 

1200705 2.549 0.5* 

49 

441 layior County 

119653 

123143 

2.933 

0.679 

122112 2-912 0.5; 

a 

453 Travis County 

576407 

594107 

2.979 

0.647 

596444 3.560 0.6< 

49 

479 Webb County 

133239 

139100 

3.376 

0.771 

137205 2.889 1.Z 

49 

495 with! is County 

122379 

125621 

2.592 

0.440 

124508 1.711 0.5! 

49 

491 Williamson County 

139531 

143640 

2.947 

0.503 

142665 2.182 0.3 

49 

011 DonrU County 

187941 

190520 

1.354 

0.7J4 

190068 1.119 0.7 

49 

035 Salt lake County 

725956 

736793 

1.471 

0.635 

735155 1.249 0.6 

49 

049 Utah County 

263390 

269091 

1.971 

0.621 

271102 2.771 0.6 

49 

057 Weber County 

159330 

160566 

1.393 

0.591 

160319 1.240 0.5 

90 

007 CMttondan County 

131761 

132031 

0.205 

0.597 

132975 0.915 0.5 

51 

013 Arlington County 

170936 

179147 

4.046 

0.491 

175566 2.637 0.7 





s, 

041 Chesterfield County 

209274 

216590 

3.378 

0.584 

212658 

1.391 

0.432 

51 

054 leirfes County 

616584 

826402 

0.946 

0.575 

833668 

1.809 

0.501 


067 Nenr ieo County 

217881 

224759 

3.060 

0.546 

221878 

1.801 

0.506 


153 Prince Willie* County 

215686 

218414 

1.249 

0.585 

220359 

2.121 

0.425 

W 

510 klenendrie city 

111183 

112748 

1.388 

0.541 

114451 

2.856 

0.771 

31 

550 Chcsepeeke city 

151976 

153512 

1.001 

0.556 

155185 

2.068 

0.509 

51 

650 Menpton city 

133793 

139284 

3.942 

0.459 

137415 

2.636 

0.617 

31 

700 Newport Mews city 

170045 

178053 

4.498 

0.468 

175121 

2.899 

0.689 

31 

710 Norfolk city 

261229 

273457 

4.472 

0.444 

269011 

2.893 

0.733 

31 

740 Portsmouth city 

103907 

108477 

4.213 

0.474 

106837 

2.742 

0.695 

51 

760 lichnond city 

203056 

209959 

3.288 

0.549 

208987 

2.838 

0.817 

51 

010 Virginie leech city 

393069 

408213 

3.710 

0.487 

402092 

2.244 

0.558 

53 

005 lenten County 

112560 

115161 

2.259 

0.556 

'15073 

2.184 

0.44* 

53 

011 Clerk Couity 

- 238053 

245741 

3.129 

0.355 

241186 

1.299 

0.533 

53 

033 King County 

1507319 

1536441 

1.895 

0.519 

1531673 

1.590 

0.612 

53 

035 Kittep County 

189731 

196029 

3.213 

0.531 

193702 

2.050 

0.425 

S3 

053 Pierce County 

586203 

607187 

3.456 

0.502 

597344 

1.865 

0.541 

S3 

Ml Snoho*ish County 

465642 

470715 

1.078 

0.625 

471683 

1.281 

0.537 

33 

063 Spoken* County 

361364 

370081 

2.353 

0.339 

365976 

1.260 

0.377 

S3 

067 Thurston County * 

161238 

166421 

3.114 

0.542 

164464 

1.962 

0.425 

53 

073 Whet com County 

127780 

131437 

2.782 

0.532 

130903 

2.386 

0.487 

33- 

077 TskiSM County 

188823 

196444 

3.880 

0.499 

195170 

3.252 

0.557 

54 

039 Keneuhe County 

207619 

213488 

2.749 

0.492 

210468 

1.354 

0.443 

55 

009 Iroten County 

194594 

197594 

1.518 

0.540 

195417 

0.421 

0.428 

53 

023 Done County 

367085 

373810 

1.799 

0.341 

370065 

0.805 

0.441 

55 

059 Kenoshe County 

128181 

130580 

1.837 

0.548 

128869 

0.534 

0.392 

33 

073 Here then County 

115400 

116699 

1.113 

0.335 

115646 

0.213 

0.518 

35 

079 Milweukee County 

959275 

969129 

1.037 

0.459 

975296 

1.643 

0.590 

55 

087 Out age* ie County 

140510 

112519 

1.410 

0.543 

H1059 

0.390 

0.428 

55 

101 * seine County 

175034 

178398 

1.886 

0.522 

176209 

0.667 

0.368 

55 

105 lock County 

139510 

H1935 

1.709 

0.558 

140129 

0.441 

0.395 

33 

117 Unetooygsn County 

103877 

105288 

1.340 

0.537 

104218 

0.127 

0.445 

35 

133 weukeshe -County 

• • - 30471* • 

• 306312 

0.521 

0.434 

305387 

0.220 

0.361 

33 

139 wirvnetugo County 

140320 

142464 

1.505 

0.549 

K085S 

0.380 

0.41! 





Mr. Miller. Thank you all very much for being here today. We 
appreciate the expertise that all three of you bring. 

Let me start with a couple of questions. Let me just go on record; 
all of you — and correct me if I’m wrong — believe that sampling was 
a failure in 1990? If they had used sampling and adjustment, that would
have been worse than the full enumeration that was used; is that 
correct? 

Mr. Darga. Yes. 

Mr. Coffey. Yes. 

Mr. Stark. Yes; statistics means never having to say you’re cer- 
tain. [Laughter.] 

I’m almost certain. [Laughter.] 

Mr. Miller. From what you know about the 2000 plan, what is 
your belief that it will be a success — that we’ll have a miraculous 
census? 

Mr. Stark. I don’t see anything in the proposal for 2000 that 
would alleviate what I see as the primary problems of what was done
in 1990. 

Using a larger sample size, in some sense, to stretch this rifle 
analogy further, is like having a more accurate rifle. If you don’t 
sight it in, you might be able to hit the same spot every time, but 
that’s not going to be the bull’s eye. The problem is the bias. What 
increasing the sample size might do is decrease the scatter, but 
that doesn’t make your shot land closer to the bull’s eye, nec- 
essarily. 

Mr. Coffey. Bias problems, generally, are quite resistant to
changes in sample size. Typically, when you’re taught about bias 
in your first course in statistics, it’s that constant term that sits 
out there after you divide by “n,” as you may know. As I indicated 
in some of my comments toward the end of the oral statement, and 
also in my written statement, my best guess is that some of these 
bias problems are actually going to increase. 
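The point made in the rifle analogy and in the remark about the constant term can be sketched numerically. The following is a minimal illustration, not anything taken from the testimony or from Census Bureau procedures: it assumes a simple model in which each measurement equals the true value plus a fixed bias plus independent noise, and every name and number in it (`expected_rmse`, `simulate_mean`, the bias of 1.0, the noise level of 5.0) is hypothetical. Under that assumption the error of a sample mean has a part that shrinks as the sample grows and a part that does not.

```python
import math
import random


def expected_rmse(bias, sigma, n):
    """Root-mean-square error of a sample mean that carries a fixed bias.

    The variance of the mean is sigma**2 / n and shrinks as n grows;
    the squared bias is a constant that no sample size removes.
    """
    return math.sqrt(bias ** 2 + sigma ** 2 / n)


def simulate_mean(true_value, bias, sigma, n, seed=0):
    """Average n biased, noisy measurements (made-up data for illustration)."""
    rng = random.Random(seed)
    draws = [true_value + bias + rng.gauss(0.0, sigma) for _ in range(n)]
    return sum(draws) / n


if __name__ == "__main__":
    # Hypothetical settings: bias of 1 unit, noise standard deviation of 5.
    for n in (100, 10_000, 1_000_000):
        print(n, round(expected_rmse(bias=1.0, sigma=5.0, n=n), 4))
```

In this sketch the printed error falls toward the bias as `n` grows but never drops below it, which is the "tighter group, same miss" behavior described above.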

I am even more concerned that some of the auxiliary information 
that has always been available for you to, at least, catch some of 
the errors you’ve created and possibly fix them, is going to be sub- 
merged in just a sea of noise of lots of new kinds. 

Mr. Darga. See, I’m also very concerned about the plans for sam- 
pling in the next census. In 1990, it would have resulted in disas- 
trous inaccuracies at the local level and major inaccuracies even at 
the national level, and I’m concerned about the same problem in 
the year 2000. 

It’s not just due to small problems in execution, such that if the Census
Bureau tries a little bit harder they can do a better job; it’s due
to the impossibility of accomplishing the task by that particular
method.

Mr. Miller. The people that are opposed to sampling have been 
referred to as Luddites, in that they oppose modern approaches to 
using new technology and such. How would each of you respond to 
that idea, that it was actually even Mr. Sawyer’s comment? And is 
there kind of a myth out there that it is no longer possible to count 
everybody, and we should just give up and use modern techniques?
How do you respond to that? 

Mr. Stark. Well, accepting the premise that sampling is a new 
idea [laughter], not every new idea is good, and no good idea
solves every problem, and this is a bit like taking a yardstick that’s
2 inches too short and trying to do a job that you need a micrometer
to do. The bias is large, the instrument is crude. What we need
is not a low-tech solution; we need a better count.

Mr. Miller. How about the fact that many of your colleagues — 
certainly not all of them — [laughter] — in the statistical community, 
academic community, believe the only solution is sampling. How do 
you address that issue? 

Mr. Stark. I take it you’re referring, for example, to the National 
Academy of Sciences reports. 

Mr. Miller. Maybe that one or the American Statistical Associa- 
tion. 

Mr. Stark. Well, when I try to come to a scientific conclusion, 
I’ve been trained to place more weight on the evidence than on the 
letterhead that the evidence is on. I read the National Academy re- 
ports; I found the evidence quite weak. There are no data on which 
the conclusions are based. There is no mathematical theory on 
which the conclusions are based. The recommendation of sampling 
in National Academy reports seems to be based on loose analogies, 
to very idealized circumstances under which sampling would be a 
good idea. 

Mr. Miller. Does anyone else wish to comment about, as far as 
the perception is out there, that everybody is a Luddite if they 
don’t believe in sampling or that because of the American Statis- 
tical Association statisticians professionally think this is the only 
solution? 

Mr. Coffey. I’ve probably seen more bad samples than everybody
else sitting at the table. [Laughter.]

Lots of them came across my desk every day at OMB. Sampling 
is not the solution to every problem, and frequently, the kind of dif- 
ficulty we had was moving an agency toward an effective sampling 
strategy and away from a poor one. 

There really are no analogs to the task that the Census Bureau 
does in performing the enumeration. You can’t draw analogs, anal- 
ogies, with other kinds of statistical work. It is a unique task in 
the whole world. It’s the largest data collection that anybody does. 
You really have to think through each problem on its merits; look 
for the solutions to that problem; and avoid the trap of trying to 
draw simple analogies with other kinds of problems that aren’t 
really comparable. 

Mr. Miller. Thank you. 

Mr. Darga. I see the question is not really being in favor of sam- 
pling or being opposed to sampling, but evaluating a particular 
methodology for using sampling to address a particular problem. I 
can’t say that I am opposed to sampling. That would be kind of like 
a carpenter saying that he’s opposed to hammers. That doesn’t 
mean that sampling should be used for every purpose in any man- 
ner. Now the particular methodology that’s been proposed for using 
sampling in the census is a very seriously flawed methodology. 

Thank you. 

Mr. Miller. Thank you all very much. 

Mrs. Maloney. Thank you. I’d like to ask each of you, despite 
the fact that the Census Bureau made improving the count among 
minorities a major goal of the 1990 census, the 4.4 percent differential
in the 1990 undercount between blacks and non-blacks was the
highest ever recorded. Experts have repeatedly said that spending
more money on traditional methods will not reduce this differential.
If not through statistics, how do you propose to reduce this
differential, Dr. Stark?

Mr. Stark. I think — I’m obviously not an expert on policy — but 
I think the way to get more accurate counts would be to get a high- 
er response rate, and that is a policy issue. What can be done to 
encourage a higher fraction of people, especially people for whom 
the undercount appears to be differential, to respond? One issue — 
one question is where the 4.4 percent number is coming from, be- 
cause there are uncertainties in the demographic estimates and 
there are also, obviously, uncertainties in the sampling-based esti- 
mates. So, how one gets at ground truth is really rather touchy, 
and how one comes up with this figure is really rather difficult. 

I deeply believe that the capture-recapture survey-based methods
like the PES or the ICM add more error than they fix, and I
don’t believe that they’re the solution. I don’t necessarily have a 
better idea on the statistical side. I think the way to improve 
things is to count better. 

Mrs. Maloney. Would you like to respond, Dr. Coffey, or Mr. 
Darga? I mean, the undercount is undisputed, and it’s growing. So, 
if you don’t use statistics, what do you do to correct it? 

Mr. Coffey. I’d have to reserve judgment on both those scores. 
The undercount in 1990 is 2.1 percent, or 1.8 percent, or 1.6 per- 
cent, or 1.2 percent, or 0.9 percent, or 0.6 of a percent. Frankly, I 
have 

Mrs. Maloney. But it’s there. 

Mr. Coffey [continuing]. Difficulty choosing. [Laughter.] 

There may be something there. [Laughter.] 

I would certainly expect that with the kind of evidence you see,
there is something there. Having a factor of three difference between
the lowest and the highest does not leave me with a sense
of comfort that I can draw conclusions, certainly not the same conclusions,
if I’d believe 2.1 as if I’d believe 0.6. Think about the 0.6;
if a substantial portion of the real undercount is unobserved, you 
are then looking at the attributes. You’re measuring the attributes 
of a modest fraction of the uncounted population. To the extent 
that those are consistent with the kinds of analysis that you get 
out of demographic analysis, you probably have some reason for
thinking that’s something you should believe. 

On the other hand, if you look at what the committee did — and 
these were very sharp people — it reached the point where they are 
starting to question some of the possible errors in demographic 
analysis — looking for sources of error to bring it down closer to the 
lower figure. 

The one thing that I came away with from reading that report 
was that I am not at all sure what the level is. I am not at all sure 
which people — I have less confidence in the policies aimed at cap- 
turing identified portions of the undercount in the absence of sure 
analysis of what the attributes of this uncounted population are. 
Basically, if somebody came along and said, you know, “I’ve got a
strategy for going after 10 different subgroups,” I’d say that’s a better
strategy than going after four, because I can’t be sure which
groups are really out there. I don’t like the idea of trying to fine-tune
a policy that may be based on inaccurate information. To
the extent that you can go after or implement policies that will im- 
prove response generally, that will, or can, go after groups that you 
think may be — on whatever evidence 

Mrs. Maloney. We have roughly a 60-year history of higher 
undercounts for African-Americans. Are you suggesting that we do 
nothing? 

Mr. Coffey. No, not at all. I’ve 

Mrs. Maloney. And it’s not just the undercount, it’s the 
overcount. The Census Bureau says 10 million undercounted, 6 
million overcounted; that’s a 16 million error. Are you suggesting 
we do nothing? It’s undisputed that there is a large undercount for 
African-Americans. What do you suggest we do about it? 

Mr. Coffey. Well, the methodology that was being used in the 
PES, precisely, leads you to that conclusion. It does dispute the size 
of the undercount in demographic analysis. 

Mrs. Maloney. Well, for 60 years they’ve been 

Mr. Coffey. I believe 

Mrs. Maloney [continuing]. Talking about it, reporting on it. 

Mr. Coffey. I believe, like most of the experts, that demographic analysis
is more reliable, which makes me fall down on the side of,
well, this other newer methodology probably isn’t working as well.

Mrs. Maloney. Well, let’s hear from the demographer. 

Mr. Darga. OK. [Laughter.] 

Mrs. Maloney. Again, we have an undercount. What do you sug- 
gest we do about it? 

Mr. Darga. Well, there are two things that we could do about it. 
The first is to do the best possible census that we can. And the sec- 
ond, since we obviously don’t know how to do a valid adjustment 
for undercount in the census — the Census Bureau also, in addition 
to conducting the census, prepares population estimates and popu- 
lation projections. 

It may be possible, at least at some levels of geography, to do a 
valid estimate of population undercount through the Estimates 
Program. Not to do an adjustment to the census, and not at the 
smallest local level where an adjustment would destroy the value 
of the census, and the validity of the data — but to do an estimate 
only at those levels of geography where the estimate makes sense. 

Mrs. Maloney. Well, I have a series of more questions, but my
time is up. At this point, will there be a second round, or not?

Mr. Miller. I’m not sure we’ll have time for a second round, but 
we’ll certainly be able to submit questions and ask you to respond, 
if you would, because we do have a third panel. I think we’ll have 
time to complete it. 

We’ll go in the order that everyone arrived, so we’ll go with Mr. 
Snowbarger next. 

Mr. Snowbarger. First of all, Mr. Darga, it’s my understanding 
from your paper, that one of the ratios that we know is the ratio 
of male-to-female children. That’s one of the most stable demo- 
graphic statistics that we have, and I think that’s 51 percent boys 
to 49 percent girls. You mentioned that the adjusted census counts 
are far from the norm, and that doesn’t quite make sense to me. 
Could you explain that in further detail? 



Mr. Darga. OK. When I was reviewing the undercount adjustments,
one of the things I noticed was that there were a number
of areas of the country, and subgroups of the population, for which 
the adjustment for boys under the age of 10 was very different 
from the adjustment for girls under the age of 10. On page 12 of 
my paper, I list 18 segments of the population for which the dis- 
crepancy was over 10 percentage points. In demography, a fraction 
of a percentage point is often a big deal, but here, we’re talking 
about a 10 percentage point difference in the size of these popu- 
lations that would result from the undercount adjustment. 

So, I saw that this was an opportunity to evaluate whether the 
adjustment was valid or not. If these areas of the country really did 
have such terrible differential undercounts between boys and girls, 
we should be able to look at the census data for these areas, and 
we’ll see that they don’t follow this 51 to 49 ratio that we find ev- 
erywhere else in the country. 

On page 14 of my paper, I have a table that shows 51 percent 
boys for every race and for every region, with very small variation 
among metropolitan areas. And then, if you look at these areas for 
which the Post Enumeration Survey suggested huge differential 
undercounts for boys and girls, we find 51 percent, 52 percent, 51, 
51, 51, 51, 51, 51, 51. There was no differential undercount of boys 
and girls in these regions. These regions are right in line with 
every place else in the country. And yet, the Post Enumeration 
Survey didn’t suggest 51, 51, 51. It suggested 49, 55, 56, 48, 53, 
49, 48, 54, 48 — wild variation in the data after you adjust it. 
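[Editorial illustration, not part of the testimony.] A minimal sketch of the comparison Mr. Darga describes, using only the percentages he reads aloud. Pairing the two lists position by position, and the helper names, are assumptions made for this sketch.

```python
# Percent boys among children under 10 for the nine areas Mr. Darga reads out.
# Pairing the two lists position by position is an assumption of this sketch.
census_pct_boys = [51, 52, 51, 51, 51, 51, 51, 51, 51]
adjusted_pct_boys = [49, 55, 56, 48, 53, 49, 48, 54, 48]

EXPECTED_PCT_BOYS = 51  # the stable share of boys cited in the testimony


def spread(values):
    """Return the range (largest minus smallest) of a list of percentages."""
    return max(values) - min(values)


def max_departure(values, norm=EXPECTED_PCT_BOYS):
    """Return the largest absolute distance, in points, from the norm."""
    return max(abs(v - norm) for v in values)


for label, values in [("census", census_pct_boys), ("adjusted", adjusted_pct_boys)]:
    print(f"{label}: spread {spread(values)} points, "
          f"largest departure from {EXPECTED_PCT_BOYS}% is {max_departure(values)} points")
```

On these figures the census shares stay within 1 point of 51 percent, while the adjusted shares range over 8 points.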

Mr. Snowbarger. Does this go to the bias that we’ve been talk- 
ing about? Is that a potential answer to that? I’m going to give all 
of you a little bit more time to explain this in layman’s terms, what 
bias is all about and, hopefully, you can be a little bit more specific 
about the bias that you saw in the 1990 Post Enumeration Survey. 
Dr. Stark. 

Mr. Stark. Just heuristically, bias means that if you measure 
things repeated times, your measurements tend to be all off in the 
same direction. It’s like using a ruler that’s too short, using a 
watch that runs fast, something like that. Yes, there’s going to be 
some fluctuation in how long it takes you to get to work when you 
measure it on your watch, but you’re always going to think it takes 
longer than it really does if your watch runs fast. 
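[Editorial illustration, not part of the testimony.] A minimal sketch of the fast-watch analogy, assuming a hypothetical 30-minute commute and a watch that runs 5 percent fast. It shows that day-to-day fluctuation averages out over repeated measurements while the bias does not.

```python
from statistics import mean

TRUE_COMMUTE_MIN = 30.0   # hypothetical true commute length
WATCH_SPEEDUP = 1.05      # hypothetical: the watch runs 5 percent fast

# Hypothetical day-to-day fluctuations in the true commute (they average to zero).
daily_fluctuation_min = [-2.0, 1.0, 0.0, 2.0, -1.0]

true_times = [TRUE_COMMUTE_MIN + f for f in daily_fluctuation_min]
watch_readings = [t * WATCH_SPEEDUP for t in true_times]

# Fluctuation averages out over repeated measurements; bias does not.
bias = mean(watch_readings) - mean(true_times)

print(f"average true commute: {mean(true_times):.2f} minutes")
print(f"average watch reading: {mean(watch_readings):.2f} minutes")
print(f"bias: {bias:.2f} minutes")
```

Every reading is longer than the true time, so taking more readings narrows the fluctuation but leaves the 1.5-minute overstatement in place.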

There are a number of sources of bias in these adjustments to 
the census. Many of them involve the models that are the premise 
for this Dual System Estimator that’s been mentioned before, the 
idea that you catch people one time with the census, you catch 
them again with the survey. You look at the overlap; you look at 
the number that was missed, one compared to the other; and you 
infer how many neither of them saw. That’s the fourth cell that Dr. 
Coffey was talking about. 

In order to infer what’s going on in that fourth cell, you have to 
assume a very special relationship between what’s going on in the 
three cells that you do see and the fourth cell that you don’t see. 
We know that that relationship doesn’t really hold, and as a result 
of that, you get bias in the estimates. 
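[Editorial illustration, not part of the testimony.] A minimal sketch of a dual system (capture-recapture) estimate for one area. It uses the standard Lincoln-Petersen form, which may differ in detail from the Bureau's estimator, and the counts are hypothetical. The "neither" cell is the fourth cell discussed above: it is inferred from the other three under an independence assumption and is never observed.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Capture-recapture (Lincoln-Petersen) estimate of the true population.

    Assumes being counted in the census and being counted in the survey
    are independent events, which is the assumption Dr. Stark questions.
    """
    if matched <= 0:
        raise ValueError("need at least one matched person")
    return census_count * survey_count / matched


def four_cells(census_count, survey_count, matched):
    """Return the 2x2 table; the 'neither' cell is inferred, never observed."""
    census_only = census_count - matched
    survey_only = survey_count - matched
    neither = census_only * survey_only / matched  # the inferred fourth cell
    return {"both": matched, "census_only": census_only,
            "survey_only": survey_only, "neither": neither}


# Hypothetical counts for one area (not taken from the 1990 PES).
cells = four_cells(census_count=950, survey_count=900, matched=855)
total = dual_system_estimate(950, 900, 855)
print(cells)
print(f"estimated population: {total:.0f}")
```

If the independence assumption fails, for example because people missed by the census are also more likely to be missed by the survey, the inferred fourth cell is too small and the estimate is biased.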

Now the Census Department, the CAPE report, and other people 
have tried to look at the bias that results from that relationship
not holding, as well as bias from other things. One of the issues --
a very difficult issue, and a very critical one, is what do you do 
with the unresolved cases? You’ve tried to match up the census 
with the Post Enumeration Survey — I’m sorry, I’m using your time 
here — you do followup on some cases, and there’s some cases where 
you simply don’t know whether there’s a match or not. It turns out 
that, depending on what you do with the cases that you were un- 
able to resolve, you can turn the adjustment, the estimate of 
undercount, from an undercount of 9 million people to an overcount 
of 1 million people, depending on what you do with these people 
whom you just don’t know what’s going on for. Now there’s a model 
that’s used to infer how you want to classify those people whom 
you can’t classify. And the result depends crucially on what you do 
there. 

Mr. Snowbarger. Mr. Chairman, I know my time has expired. 
Could we allow the other two to, maybe, quickly answer the same 
question? 

Mr. Miller. If they’re short answers because we are running out 
of time. 

Mr. Darga. You asked if it’s due to bias. It’s due to a combina- 
tion of sampling error and measurement error or bias. Sampling 
error is where the experience of the area selected for the sample
isn’t totally representative of what’s happening in the broader area, 
and measurement error is where you don’t even get an accurate 
measurement of what’s happening within your sample. So, it’s a 
combination of sampling error and non-sampling measurement 
error. 

Mr. Snowbarger. Thank you, Mr. Chairman. 

Mr. Miller. Thank you. Mr. Davis. 

Mr. Davis of Virginia. Let me just say from my perspective, if 
you’re concerned about overcounts in one group or another, the one 
thing you don’t want to do is replace one set of inaccuracies with 
another set of inaccuracies, and I think that’s our concern here. We 
do know that there are ways to focus on a better count, and that 
there are some very tangible things that we can do to bring it up 
if we’ll give it the appropriate focus. 

I’ve got a few questions for Mr. Darga. I wonder if you could explain,
in layman’s terms, the problems that the Bureau faces in
matching census data to the PES data. You mentioned in your 
opening statement that the Census Bureau listed six serious 
sources of error that caused sample results to be unreliable. I won- 
der if you could list those again? 

Mr. Darga. OK. They were survey-matching error, fabrication of 
interviews, ambiguity or misreporting of usual residence, geo-cod- 
ing errors, unreliable interviews, and unresolvable cases. 

Mr. Davis of Virginia. What about matching errors? Can you ex- 
plain what 

Mr. Darga. OK. 

Mr. Davis of Virginia [continuing]. They are? 

Mr. Darga. OK. Well, the Census Bureau first does the census, 
then they do the Post Enumeration Survey. Then, to find out who 
was missed in the census, they match the results of those two sur- 
veys person by person. And if you make a mistake in matching, if 
you really do have the same person in both surveys, but you fail
to match them -- you don’t realize that you have them in both surveys --
then you’re identifying undercount where there is no
undercount. 

And it turns out that matching is an extremely difficult thing to 
do. Now, perhaps if we had a Social Security number in the census 
we’d be able to match accurately. But what we have is a name pro- 
vided sometimes by a householder, and sometimes by proxy inter- 
views with neighbors or landlords. It was particularly a problem in 
the Post Enumeration Survey when data was obtained from neigh- 
bors and landlords. So, it’s very difficult to just look at the data 
and find out if it matches, and the Census Bureau demonstrated 
that in their evaluation of the Post Enumeration Survey. 

Their method was to first have the match done by a computer, 
which took care of 75 percent of the cases — and we know that com- 
puters don’t make mistakes, so we can assume that this 75 percent 
was 100 percent correct. But then, the remainder went to two inde- 
pendent teams. And the really disconcerting thing is that these two 
teams, using exactly the same data, and exactly the same guide- 
lines, couldn’t agree on who matched and who didn’t. Of those clas- 
sified as matched by the first team, 5.7 percent were classified as 
not matched by the second team. And another 4.5 percent were 
classified as unresolved, so that’s 10 percent. 
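[Editorial illustration, not part of the testimony.] A minimal sketch of the arithmetic behind "so that's 10 percent," set beside the 1.8 percent undercount figure from demographic analysis that Mr. Darga cites. The two percentages have different bases (cases reviewed by the matching teams versus the whole population), so the ratio is only a rough indication of scale.

```python
# Figures quoted by Mr. Darga for cases that the first team classified as matched.
PCT_NOT_MATCHED_BY_SECOND_TEAM = 5.7
PCT_UNRESOLVED_BY_SECOND_TEAM = 4.5
# Net national undercount according to demographic analysis, as quoted.
PCT_UNDERCOUNT_DEMOGRAPHIC = 1.8

disagreement_pct = PCT_NOT_MATCHED_BY_SECOND_TEAM + PCT_UNRESOLVED_BY_SECOND_TEAM
ratio = disagreement_pct / PCT_UNDERCOUNT_DEMOGRAPHIC

print(f"team disagreement: {disagreement_pct:.1f} percent of the first team's matches")
print(f"that is about {ratio:.1f} times the {PCT_UNDERCOUNT_DEMOGRAPHIC} percent undercount")
```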

Now, if you realize that we are trying to measure an undercount 
of only 1.8 percent of the population, according to demographic 
analysis, having the data be so hard to match that you can’t even 
agree on the match status of 10 percent of the cases that come be- 
fore you, demonstrates that 

Mr. Davis of Virginia. You can be much more inaccurate 

Mr. Darga. Right. 

Mr. Davis of Virginia [continuing]. From the sample than you
would with just a regular count.

Mr. Darga. That fact alone. The discrepancy between these two 
teams of matchers, by itself, invalidates any adjustment based on 
the Post Enumeration Survey. 

Mr. Davis of Virginia. So, what are some of the challenges involved in matching
records between the actual enumerations and the sample census?

Mr. Darga. OK, let’s see. On page 9 of my paper, I have a whole 
page that lists some of those problems. Things like similarity of 
names, or the use of aliases in either the census or the Post Enu- 
meration Survey, can make it difficult to match people. People who 
move are an extremely serious problem. Mr. Sawyer indicated that
we are a mobile society; people have multiple residences. That 
makes it hard to do a census, but it makes it impossible to do a 
Post Enumeration Survey and then match it with near 100 percent 
accuracy against the census. Those trends in our society are even 
more problematic for the adjustment methodology than they are for 
the census itself. 

We heard discussion, also, of proxy interviews; 4.3 percent of the 
cases in the Post Enumeration Survey were proxy interviews — 
interviews with neighbors, or landlords, or other people in the 
neighborhood. For a census, you at least have a mail-out/mail-back 
methodology so that people on the go can at least turn in their cen- 
sus form. A Post Enumeration Survey is entirely interviewer-based, 
so it has a high level of proxy data. And when you try to match
the data that you gave in the census with the data that your neighbor
gave in the Post Enumeration Survey, it’s not going to be easy
to see. Is Mindy the same person as Mandy? Is this 23-year old the 
same person as that 28-year old? It’s extremely, extremely difficult 
to do a match with 100 percent accuracy. 

Mr. Davis of Virginia. I think we get the picture now. OK, thank 
you. I think my time’s up. 

Mr. Miller. Thank you very much. Mr. Shadegg. 

Mr. Shadegg. Thank you, Mr. Chairman. Let me begin with 
what I understand to be at least my articulation of a layman’s understanding
of part of the problems with the Post Enumeration
Survey and the projection of its results on populations that you 
didn’t actually count. Any of you can comment on this description 
and tell me if it’s accurate or inaccurate. 

As I understand it, there are two problems. One, you make an 
assumption of a relationship between the people you are surveying 
and the people you are projecting that survey onto, and that as- 
sumption may or may not be accurate. There are inaccuracies in 
and of itself. Second, you base that assumption or that projection, 
based on that assumption, on the actual count of the first popu- 
lation, and there may be errors in it. So, you really have two 
compounding errors when you try to project from a known to an 
unknown. Is that right? Is that a fair description? 

Mr. Stark. At least two. 

Mr. Coffey. Wouldn’t — if there were only two? [Laughter.] 

Mr. Shadegg. There are at least two that I’ve been able to absorb.
[Laughter.]

Mr. Darga, let me focus on a point you raised, because it’s a 
point of grave concern to me. You said that at some levels of geog- 
raphy, you might be able to use a sampling method to try to make 
an approximation of numbers. But, the problem is that when you 
try to do sampling at the very small geographical areas which are 
the focus of the census, that is when we get down to, for example, 
redistricting or reapportionment or to allocation of Government re- 
sources based on the boundary line between several small towns. 
At that level, small geographical areas, there is a particular problem
with trying to use sampling methods to project accurate numbers. 

Mr. Darga. I don’t know of a way to use sampling to estimate 
undercount, even at the national level. 

Mr. Shadegg. OK. 

Mr. Darga. I think that an estimate of undercount at the na- 
tional level would be better based on demographic analysis. 

Mr. Shadegg. So even at that level for undercount, you think it’s 
not wise? 

Mr. Darga. Right. Given the fact that a majority of the people 
identified as undercounted by the Post Enumeration Survey really 
aren’t undercounted. Given that fact, we really need to reassess a 
lot of what we think we know about undercount. 

Mr. Shadegg. That’s a message I’ve gotten here pretty consist- 
ently today. 

Dr. Coffey, in your opinion, what was the single biggest problem 
with the 1990 survey, itself? 

Mr. Coffey. I was not — I had no firsthand experience with this. 
But looking at the report, itself, I would have to agree with my colleague
here that a lot of the difficult problems involve the dependence
of that methodology -- a very sensitive dependence -- on
matching correctly. There appeared to be a number of issues raised
by the expert panel which really seem to revolve around whether
or not you were able to do a good match, whether you were properly --
and it is critical for this four-cell approach that your matching
determination tells you which cell it’s in. If you get that
wrong, the whole thing begins to come apart. So, I suspect that 
that might have been the largest problem with that particular ap- 
proach. 

Mr. Shadegg. Of the various problems that you’ve identified, are 
any of them solved by the proposed method for 2000? Or, do you 
believe they are susceptible of solution? 

Mr. Coffey. As I said in my prepared statement, I don’t think
increasing the sample size and truncating followup, so that you
get a much larger uncounted group, are going to be any good
to solve the real problem that was identified in that 1992 evaluation.
It may look a little better, but the core of bias from the tough
cases will still be there and, in some respects, some of the problems 
are going to scale right up with the larger 

Mr. Miller. Speak into the microphone if you would, please. 

Mr. Coffey. Some of the problems are going to scale right up 
with the larger tasks that you’ve created under the 2000 plan. The 
matching problem — you’re now going to have a lot more matching 
to do. One, you’re going to have a much larger sample. You’re going 
to have more cases where, what happens if you try to match some- 
one who happened to have been in that 10 percent that you decided 
not to enumerate? How do you draw an inference there?

There are going to be much larger problems. And some of these 
are going to actually inflate the absolute level of bias over what the 
expert panel and the census committee found after the 1992 eval- 
uation. 

Mr. Shadegg. One of the straightforward difficulties that I’ve
had with the proposed sampling — and I’ve had constituents ask me 
about it at home — is, well, if we are going to do an, quote, unquote, 
“actual count,” an actual enumeration of either 88 percent or 90 
percent of the total population, and then we’re going to sample, to 
decide the remaining 10 to 12 percent, “Well, how do we know that 
we got the 88 percent of the total or 90 percent of the total if we 
don’t know what the total is?” Do any of you have an answer or 
response to that particular dilemma? 

Mr. Stark. You don’t know. 

Mr. Darga. Well, as I understand it, the source of that informa- 
tion is vacancy reports from letter carriers. [Laughter.] 

Mr. Shadegg. Great. [Laughter.] 

Mr. Darga. In the Census Bureau’s preliminary testing, they
found roughly a 40 percent disagreement between what the letter
carriers reported as vacancy and what their own enumerators,
checking the letter carriers’ reports, found. So, I’m very concerned
that the information being used for that may not be accurate.

Mr. Miller. Thank you. 

Mr. Shadegg. Can either of you other gentlemen give me any assurance
on that issue? Or do you share my concern?

Mr. Stark. I share your concern. 



Mr. Coffey. Yes. 

Mr. Shadegg. Thank you, Mr. Chairman. 

Mr. Miller. Thank you. Let me thank the three witnesses here 
today: Dr. Stark, coming from Berkeley, CA; Mr. Darga, coming 
from Lansing, MI; and Dr. Coffey, coming from 

Mr. Coffey. Nearby Fairfax County, VA. [Laughter.] 

Mr. Miller. Fairfax County, VA — [laughter] — but with 30 years 
of experience within the statistical work in the Government. 

We thank you very much, and you’ve provided a lot of help and 
insight into the problems we are facing. Thank you very much for 
being here. 

[Followup questions and responses follow:] 




ONE HUNDRED FIFTH CONGRESS

Congress of the United States

House of Representatives

COMMITTEE ON GOVERNMENT REFORM AND OVERSIGHT
2157 Rayburn House Office Building
Washington, DC 20515-6143





May 15, 1998 


Mr. Jerry L. Coffey, Ph.D.
9119 Tetterton Ave.
Vienna, VA 22182

Dear Mr. Coffey, 

Thank you for testifying before the Government Reform and Oversight 
Subcommittee on the Census on May 5, 1998. Because of time constraints, I was left 
with a number of questions unanswered. Therefore, I request that you answer the 
following questions: 

1. In your testimony you indicated some reservations about the “ground rules” of the
CAPE Committee’s evaluation. What were these reservations?

2. You stated that the 1.58% revised estimate of the 1990 undercount released by the
Census Bureau still contained bias and that using the Census Bureau estimate of
measured bias and their assumptions concerning the offsetting “correlation bias” from the
CAPE report, their estimate not of bias was 1.2% of undercount. How does this compare
to the similar figure attributed to the 1980 Census?

3. From your experience as a professional statistician, to what extent are errors
experienced in an actual enumeration likely to appear in a sample as well?

4. In response to a question, you indicated that you had seen a lot of “bad” sampling.
Can you elaborate? How do you determine when sampling is appropriate or effective?

5. In response to a question you implied a relationship between errors in measuring 
undercount and strategies for reducing the differential undercount. How does the former 
affect the latter? 

My questions and answers will be part of the permanent record of the May 5, 1998 
hearing. Again, thank you for your insight into this important process.





Sincerely, 

Dan Miller 
Chairman 

Subcommittee on the Census 
cc: The Honorable Carolyn B. Maloney 



Responses to Chairman Miller -- 

1. In your testimony you indicated some reservations about the "ground rules" of the 
CAPE evaluation. What were these reservations? 

A. I have two reservations, both concerning the standards of accuracy being applied. 

First, there is the emphasis on "state-level" accuracy. It is true that accuracy of the state totals 
alone would permit the Census Bureau to provide information that discharges its obligation for 
supporting apportionment (assigning a number of seats to each state). But this disregards 
(irresponsibly in my opinion) the need for accurate substate data with which to actually construct 
congressional districts. Important rights are at stake in the accuracy of this process and there is 
no other data base that will serve. 

Second, even the standard of state-level accuracy is treated as an "average" standard. The CAPE 
expert panel found that adjusted totals looked all right on the average, with some exceptions. In 
other words the biases (most of the problem) were judged to be acceptable in most cases. This 
disregards the concept of individual fairness (discussed at length in my response to Maloney #2). 
Given what is at stake, one must question whether individual states or voters would (or should 
have to) live with this kind of judgment. 


2. You stated that the 1.58% revised estimate of the 1990 undercount released by the 
Census Bureau still contained bias and that using the Census Bureau estimate of measured 
bias and their assumptions concerning the offsetting "correlation bias" from the CAPE 
report, their estimate not (sic) of bias was 1.2% undercount. How does this compare to the 
similar figure attributed to the 1980 Census? 

A. I believe the phrase I used was "net of bias." The published estimate of 1.58% is the sum of 
measured undercount (the target group) and measured bias (DSE error that exaggerates the 
undercount). To get at the "best estimate" of overall undercount that can be inferred from the
CAPE analysis, the "measured bias" must be subtracted out and the unmeasured undercount (the
missing piece represented by the estimated correlation bias) must be added back in. The result is
1.58 - 0.73 + 0.38 = 1.23 or about 1.2%. Each of these terms has been revised (and they are still
being revised) so this estimate is certainly not precise to hundredths of a percent; in fact, even the
tenths position is soft. Thus the actual similarity between this 1.2% (plus or minus something)
and the 1.2% (also plus or minus something) estimated for the 1980 Census is surely
coincidence. I attended one of the near riots that followed release of the 1980 figures (with the 
huge "closure error") -- acceptance of the 1990 figures was relatively sedate by comparison. 
Attempts to draw fine comparisons based on numbers ten years apart from evolving
methodologies in the face of unmeasured uncertainties are, at best, naive.
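[Editorial illustration.] A minimal sketch of the "net of bias" arithmetic in the answer above, using only the three figures given there. The variable names are chosen for this sketch.

```python
PUBLISHED_ESTIMATE = 1.58   # revised PES estimate of net undercount, percent
MEASURED_BIAS = 0.73        # DSE error that exaggerates the undercount, percent
CORRELATION_BIAS = 0.38     # estimated unmeasured undercount, percent

# Remove the measured bias, then add back the unmeasured piece.
measured_undercount = PUBLISHED_ESTIMATE - MEASURED_BIAS
net_of_bias = measured_undercount + CORRELATION_BIAS

print(f"measured undercount: {measured_undercount:.2f} percent")
print(f"net of bias: {net_of_bias:.2f} percent")
```

The intermediate value of 0.85 percent is the DSE measured undercount; as the answer stresses, none of these figures is reliable to the hundredths place.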

In terms of quality control, closure error (see Maloney question #21), and the DSE measured 
undercount (0.85%), 1990 appears better than 1980. But if one selects other measures, 1980
looks better. When one considers the size of the uncertainties and the fact that the variance
(error) of the difference between two (independent) estimates is the SUM of variances (errors) in
the two components, one would be hard pressed to find any significant difference in overall 
quality between 1980 and 1990. 
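[Editorial illustration.] A minimal sketch of the rule that the variance of a difference between two independent estimates is the sum of their variances. The standard errors below are hypothetical, since the response gives no uncertainty figures for either census.

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of (A - B) when A and B are independent estimates.

    Var(A - B) = Var(A) + Var(B), so the standard errors add in quadrature.
    """
    return math.hypot(se_a, se_b)


# Hypothetical standard errors, in percentage points, for the two censuses.
SE_1980 = 0.4
SE_1990 = 0.3

se_diff = se_of_difference(SE_1980, SE_1990)
print(f"standard error of the 1980-1990 difference: {se_diff:.2f} points")
```

Under these assumed values the difference carries more uncertainty than either estimate alone, which is why a small gap between the two censuses would be hard to call significant.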


3. From your experience as a professional statistician, to what extent are errors 
experienced in an actual enumeration likely to appear in a sample as well? 

A. The errors that occur in an enumeration are what statisticians call "nonsampling errors." The 
introduction of sampling per se has no effect on these errors from sources other than sampling; it
adds a new source of error (sampling error) to those sources that already exist. If no other 
changes are made, then the total error (consisting of both sampling and nonsampling error) 
increases. There is, however, an exception to this rule — when the sample is small enough to 
permit use of improved enumeration methods that directly reduce these other errors. 
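[Editorial illustration.] A minimal sketch of the rule and its exception, assuming the two error sources are independent and can be combined as a root-mean-square total. All magnitudes are hypothetical and in arbitrary units.

```python
import math


def total_error(nonsampling_error, sampling_error):
    """Root-mean-square total error, assuming the two sources are independent."""
    return math.sqrt(nonsampling_error ** 2 + sampling_error ** 2)


# All figures are hypothetical, in arbitrary units.
full_enumeration = total_error(nonsampling_error=1.0, sampling_error=0.0)
same_methods_sample = total_error(nonsampling_error=1.0, sampling_error=0.5)
refined_small_sample = total_error(nonsampling_error=0.4, sampling_error=0.5)

print(f"full enumeration:            {full_enumeration:.2f}")
print(f"sample, same methods:        {same_methods_sample:.2f}")
print(f"small sample, refined methods: {refined_small_sample:.2f}")
```

With unchanged methods the sample only adds error; the total falls below the full-enumeration figure only when the smaller scale allows methods that cut the nonsampling error.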

The ASA Blue Ribbon Panel on the Census discussed this in its technical attachment to the 1996 
Report — 

"The achievement of greater accuracy depends on how much more accurate the refined 
enumeration procedure is than the standard procedure and how much greater is its cost.
These factors need to be evaluated for each specific case to determine the comparative 
accuracy of census and sample results. It should also be noted that more refined 
enumeration methods can sometimes only be employed with smaller-scale, (i.e., sample), 
studies. Reasons for this include the need for highly trained enumerators who are 
available only in limited numbers and the use of burdensome questionnaires that can be 
employed only with a small sample." 

The samples proposed for following up nonresponse are very large (not much smaller than the 
total nonresponding population that would be treated in full follow-up). This requires the same 
enumeration procedures (with the same contribution of nonsampling error) in the sample as 
would apply to the full follow-up alternative. If refinements that would reduce nonsampling 
error (the kind described by the Blue Ribbon Panel) were feasible on this large scale, then the 
cost of extending such refinements to full follow-up would be modest. (See also Maloney #24.) 

4. In response to a question, you indicated that you had seen a lot of "bad" sampling. Can 
you elaborate? How do you determine when sampling is appropriate or effective? 

A. Probably the most blatant case was the use of sampling in furtherance of tax fraud. It is an 
interesting story, but (fortunately) relatively rare. A much more widespread abuse is the use of 
methods that produce wide error bands so that preferred interpretations are not excluded as 
inconsistent with the data. Another variation on this theme is the use of methods that are not 
robust so that different combinations of assumptions can be tested by the policy shop until some 
preferred result is produced. Statisticians know these risks better than most, but too often they 
salve their conscience with "caveats" and let the advocates do the damage. 



Some of the red flags are: 

1) "new" methods that have not been evaluated, 

2) the choice of methods with large variance or bias when tighter methods would make 
the implicit assumptions more visible, and 

3) use of non-robust methods (that are sensitive to assumptions) when more robust 
alternatives are available. 

OMB frequently requires a SUCCESSFUL evaluation before approving such methods, but this 
doesn't always work. 


5. In response to a question you implied a relationship between errors in measuring 
undercount and strategies for reducing the differential undercount. How does the former 
affect the latter? 

A. This is discussed in detail in my response to Maloney #6 and #19. 



5. The 1990 census cost 20 percent more per household in real dollars than the 1980 
census. The 1980 census cost twice as much per household in real dollars as the 
1970 census. That is an increase in real dollar cost per household of 250 percent 
with no improvement in the differential undercount. Does that suggest to you that 
spending more on traditional methods will reduce the differential undercount? 

6. Demographic analysis showed higher undercounts of African Americans than the 
undercounts demonstrated by the Post Enumeration Survey. That suggests that 
the Post Enumeration Survey understates, not overstates, the undercount, 
especially for minorities. In other words, isn’t it likely that the 1990 census 
missed more African-Americans than would have been added back into the census 
by the Post Enumeration Survey? 

7. You have talked a lot about bias in the Post Enumeration Survey but have not 
talked much about the bias in the census. The differential undercount measured 
by demographic analysis shows that bias in the census is quite real. If there is no 
Integrated Coverage Measurement, is it not the case that this bias in the census 
will continue? 

8. Do you believe that it is acceptable for the census to consistently miss certain 
segments of the population -- African Americans, Latinos, Asian Americans, 
poor people in rural and urban communities -- at greater rates than the White 
population? If that is not acceptable, what do you propose be done to reduce the 
differential undercount? Can you offer any evidence that your proposal(s) will 
reduce the differential undercount? 

9. It has been stated that one of the faults of the 1990 PES was correlation bias. Can 
you explain correlation bias? I understand that it is the likelihood that the people 
missed in the census may be the same people missed in the PES. Said another 
way, both the census and the survey miss the same people, for example, young 
Black males. How does correlation bias affect the accuracy of the count of those 
traditionally undercounted: Blacks, Hispanics, Asians, Native Americans, renters? 

10. Wouldn’t the only risk of correlation bias be minimization of the undercount 
rather than an overestimation of the undercount? 

11. In testimony before the Senate Committee on Governmental Affairs 
approximately one year ago, Dr. Lawrence Brown, Professor of Statistics at the 
University of Pennsylvania, stated that, “Statistical sampling methods can be used 
in an effective and objective way to assist the census process.” Do you agree with 
Dr. Brown’s statement? If you disagree, please explain why. 

12. Dr. Lawrence Brown also testified before Senator Thompson that the Sampling 
for Nonresponse Follow-up plan “is an objective procedure all the way around 
[and] has a very good chance of working as desired.” Do you agree with that 
statement? If you disagree, please explain why. 

13. In addition, Dr. Brown testified that the Census Bureau’s 2000 census plan had 
been “drastically simplified and improved. ...[these changes] make it possible to 
now believe that the Integrated Coverage Measurement might work as well as 
desired to correct the undercount.” Do you agree with that statement? If you 
disagree, please explain why. 

14. With regard to concerns that the Integrated Coverage Measurement process could 
be manipulated to achieve a particular outcome in terms of the population counts, 
Dr. Brown testified that, “if all of this planning is done in advance, it is very, very 
hard for me to see how one could direct these subjective decisions towards any 
desired goal.” Do you agree with Dr. Brown that if the procedures and protocols 
for the Integrated Coverage Measurement are set forth in advance and subject to 
expert and public scrutiny, that it is very unlikely that the sampling and statistical 
estimation process will be subject to manipulation, possibly for political 
advantage? If you disagree, please explain why. 

15. Dr. Brown also testified that even after the non-response follow-up phase of the 
census is complete, there “would still [be] the undercount problem of those people 
who just refuse to be counted or are very difficult to count.” Do you agree with 
that statement? If you disagree, please explain why. 

16. With regard to the post-enumeration survey in the 1990 census, Dr. Brown 
testified that many of the difficulties with the procedure “can be traced to the fact 
that the PES sample was much too small to support the kind of objective, reliable 
analyses that are desired.” Do you agree with that? If you disagree, please 
explain why. 

17. The size of the sample in the Integrated Coverage Measurement (ICM) is 750,000 
households. Is that a proper size for such an endeavor? 

18. The results of the PES in 1990 showed that the census was less accurate than its 
predecessor. That result was confirmed by demographic analysis, which has been 
performed on every census since 1940. We certainly know that the 1990 census 
was much more expensive than the 1980 census. Do you agree with the 
conclusion that 1990 was also less accurate than 1980? 

19. Please explain the difference between net over- or undercount in the 1990 census 
count and actual over- and undercounts (mistakes) made in the 1990 count. I 
know that a net undercount of 1.6% sounds relatively small but for census 
purposes, aren’t those 26 million mistakes a concern? 



20. I understand that improvement in the average does not necessarily mean that there 
will be improvement in every case. In 1990, there was criticism about the strata 
being broken down by region. If statistical methods are used in 2000, with strata 
broken down by state in 2000, can we expect more states with improved accuracy 
than there were in 1990? 

21. Representative Sawyer pointed out that the longer the Census Bureau is in the 
field, the higher the error rate in the information collected. I believe that 
information came from one of the many GAO studies he and his Republican 
colleagues commissioned. You have stated your concern about the Census 
Bureau not being in the field for enough days in the 2000 plan. Can you explain the 
difference in opinion? 

22. In order to address the problem of declining public response, the GAO suggested 
exploring a radically streamlined questionnaire in future censuses. Would you 
give us your thoughts on how effective this approach might be in increasing 
response, and also its effect on perhaps diminishing the usefulness of census data? 

23. In its 1992 capping report on the 1990 census, the GAO concluded that “the 
results and experiences of the 1990 census demonstrate that the American public 
has grown too diverse and dynamic to be accurately counted solely by the 
traditional ‘headcount’ approach and that fundamental changes must be 
implemented for a successful census in 2000.” Do you agree with that 
conclusion? If you disagree, please explain why. 

24. After the 1990 census, GAO concluded that “the amount of error in the census 
increases precipitously as time and effort are extended to count the last few 
percentages of the population.... This increase in the rate of error shows that 
extended reliance on field follow-up activities represents a losing trade-off 
between augmenting the count and adding more errors.” In the last months of the 
follow-up efforts in 1990, GAO estimated that the error rates approached 30 
percent, and that this problem was probably exacerbated by the use of close-out 
procedures. This appears to be a problem inherent to the methodology of the 
1990 census. Don’t you agree? 

Do you have any information on the error rates for information gathered using 
close-out procedures? 

Even if sampling is not perfect, isn’t its error rate well below the levels for the last 
percentages of the population using more traditional follow-up procedures? 

If this is the case, then doesn’t that logically lead to GAO’s and the Commerce 
Department’s Inspector General’s conclusion that sampling at least a portion of 
the nonresponding households would increase the accuracy and decrease the cost 
of conducting the census? 



25. GAO also concluded after the 1990 census that a high level of public cooperation 
is key to obtaining an accurate census at reasonable cost. Unfortunately the mail 
response rate has fallen with every census since 1970, and was only 
approximately 65 percent in 1990. The reasons for this decline are in many 
instances outside of the Census Bureau’s control, for example the increase in 
commercial mail and telephone solicitations and in nontraditional household 
arrangements. For these reasons, the Bureau is planning a public education 
campaign for the 2000 census, surpassing any previous attempts. Given the 
response in 1990, do you believe this is money well-spent? 

Do you believe that this public education campaign can succeed in arresting the 
decline in response rates? 

Even if it does, wouldn’t some use of sampling be warranted to solve the 
problems associated with reaching the last few percentages of nonresponding 
households? 

My questions and your answers will be part of the permanent record of the May 5, 1998, hearing. 

Again, thank you for your input into this most important process. 


Sincerely, 



Subcommittee on the Census 


cc: The Honorable Dan Miller 



STATE OF MICHIGAN 



JOHN ENGLER, Governor 


DEPARTMENT OF MANAGEMENT & BUDGET 


P.O. BOX 30026, LANSING, MICHIGAN 48909 
MARY A. LANNOYE, State Budget Director 


June 19, 1998 


Office of the State Demographer 
Michigan Information Center 


Honorable Carolyn B. Maloney 
Ranking Minority Member 
Subcommittee on the Census 
U.S. House of Representatives 

Dear Representative Maloney: 

Thank you for the twenty-five questions which you included in your memo to me on 
May 13, 1998. These questions raise a wide range of important issues, and I hope 
that the following responses will lead to a dialog which advances the debate on the 
techniques proposed for use in the next census. 

1. Can you tell us about a statistical or scientific activity that you’ve worked 
on that either worked perfectly the first time you tried it, or that didn’t 
work as well as you had hoped the first time so you abandoned the idea 
altogether without making an effort to improve or redesign it? 

Very early in my career, I had an experience with this dilemma which I believe 
can shed a great deal of light on the process of computing adjustments for 
census undercount. 


In the first government agency I worked for, I was once asked to do a quick 
analysis to show the cost of excess hospital capacity in Michigan. I had a 
pretty good idea what to expect based on the published literature on the subject, 
but my first calculations showed just the opposite of what I expected. 

Naturally, the question I asked myself was “What did I do wrong?” When I 
reviewed my computer program with this question in mind, I found a simple 
computational error that explained a large part of the problem. The figures still 
didn’t point in the expected direction, but at least they didn’t point so strongly 
in the “wrong” direction. 



I couldn’t find any more mistakes in my program, so the next question I asked 
myself was “How can I improve the analysis?” Since I had been taking a very 
simple approach to a very complex question, it didn’t take long to find that I 
had left out some important factors which biased the results in the “wrong” 
direction. When I repeated the calculations with allowances for those factors, I 
got the results that I expected. 

Unquestionably, the changes which I made were improvements. I had 
produced an analysis that was consistent with my expectations about what was 
true and with the published literature on the subject. But that experience left 
me with two important questions: 

(1.) What would have happened if my initial results had been consistent with 
my expectations? Would I even have found my computational error if I 
hadn’t had to ask myself “What did I do wrong?” 

(2.) What would have happened if my expectations and the initial results had 
been the opposite of what actually happened? What if I had expected 
excess hospital capacity to decrease hospital expenditures instead of 
increasing them, and what if my first calculations had shown an increase? 
Would I have been able to find some legitimate factors that were left out 
of the initial analysis which biased the results in this new direction? 
Would I have “improved” the analysis in the opposite direction if I had 
the opposite expectations? 

I had encountered a dilemma which faces all researchers, whether they are 
aware of it or not: 

On the one hand, it is probably impossible to produce good research on a 
complicated problem without finding and correcting mistakes and 
modifying methods based on new insights that are gained in the course of 
the analysis. And a principal way to find those mistakes and gain those 
new insights is by finding things that are contrary to expectations and 
figuring out either what went wrong or how the data and the analysis can 
be improved. 

On the other hand, when the corrections and refinements are driven by 
expectations of what the results should be, the research will tend to 
conform to those expectations regardless of whether those expectations 
are correct and regardless of whether the data and methodology are sound. 


I believe that this personal experience and this dilemma shed a lot of light on 
the process of measuring undercount through a post-enumeration survey. In 
one respect, the analysis of the post-enumeration survey is exactly the opposite 
of the analysis described above: instead of being too simple, it is incredibly 
complex. Yet it illustrates the dilemma of expectation-driven analysis even 
better than my personal experience: Matching survey responses with census 
responses is so difficult and it involves so many errors of so many types that it 
sets up an impossible dilemma for the Census Bureau. On the one hand, it is 
necessary to monitor the quality of processes to ensure that they are producing 
plausible results, to check outliers and disparities, to look for problems, and to 
correct problems when they are found. On the other hand, those necessary 
measures tend to make the results conform to expectations, irrespective of the 
correctness of the expectations or the soundness of the underlying data and 
methodology. 

Some of the corrections that were made had a very large impact on the final 
adjustments for undercount. For example, when certain blocks seemed to have 
too much undercount, records were sent for re-matching and they came back 
with different results: re-matching just 104 out of 5,290 block clusters resulted 
in a decrease of 250,000 in the estimated net national undercount. When other 
blocks had obvious problems due to geocoding errors, they were 
“downweighted” so they would have less impact: downweighting just 2 block 
clusters reduced their impact on the national net undercount from nearly 1 
million persons to only about 150,000 persons. A computer programming error 
was found which contributed over 1 million persons to the net national 
undercount. Without these three corrections, the final estimate of net 
undercount would have been about 40% higher than it was, and it would not 
have been plausible even at the broadest national level. On the one hand, it 
would be difficult to argue that these corrections should not have been made. 
On the other hand, it is clear that there were enough remaining errors that any 
of the adjustment factors could still have been “corrected” significantly in 
either direction. [For further discussion of the difficulties of matching surveys 
and the high level of error in the undercount analysis, please see page 9 of the 
first paper¹ and pages 3 through 13 of the second paper² which I submitted to 
the Subcommittee on 5/5/98.] 
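
As a rough check on the arithmetic in the preceding paragraph, the three corrections can be totalled. This is a sketch using the rounded figures quoted above: "nearly 1 million" and "over 1 million" are both treated as exactly 1,000,000, which is an assumption. The implied final estimate is derived from the "about 40%" statement, not taken from any Census Bureau table.

```python
# Rounded figures from the text; "nearly 1 million" and "over 1 million"
# are both treated as exactly 1,000,000 (an assumption for illustration).
rematching = 250_000                   # re-matching 104 block clusters
downweighting = 1_000_000 - 150_000    # downweighting 2 block clusters
programming_error = 1_000_000          # computer programming error

total_correction = rematching + downweighting + programming_error

# If undoing the corrections would raise the estimate by about 40%, the
# final estimate implied by that statement is total_correction / 0.40.
implied_final_estimate = total_correction / 0.40

print(total_correction)                # 2100000
print(round(implied_final_estimate))   # 5250000
```

On these rounded inputs the three corrections total about 2.1 million persons, which is consistent with the statement that the uncorrected estimate would have been about 40% higher only if the final net undercount estimate was a little over 5 million.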

One of the paradoxes of the PES analysis is that it produced a seemingly 
plausible picture of undercount at the broadest national level despite its many 
obvious flaws. However, once the potential role of expectations in refining the 
data is understood, this is not surprising at all. Given enough time, resources, 
and methodological flexibility, the adjustment factors could probably be 
corrected until they produced virtually any pattern of undercount that is 
deemed plausible. 

2. Despite the fact that the Census Bureau made improving the count among 
minorities a major goal of the 1990 Census, the 4.4 percent differential in 
the 1990 undercount between Blacks and non-Blacks was the highest ever 
recorded. Experts have repeatedly said that spending more money on 
traditional methods will not reduce this differential. If not through 
statistics, how do you propose to reduce this differential? 

First, I would like to comment on the observation that the differential 
undercount in 1990 was the highest ever recorded. It is true that the difference 
between the estimated undercount for blacks and the estimated undercount for 
other races increased from 4.3 percentage points in 1970 to 4.4 percentage 
points in 1990. However, it would be a mistake to suppose that the undercount 
has been getting worse in each census. In fact, according to the Census 
Bureau’s “demographic analysis” method, the undercount for blacks in 1990 
was the second lowest ever recorded. Likewise, the 1990 undercount for 
whites was the second lowest ever recorded and the overall undercount was the 
second lowest ever recorded. The lowest undercounts ever recorded were in 
1980. 

Thus, the last two censuses have been our most accurate in history with respect 
to undercount. Although there is certainly room for improvement, it is evident 
that the Census Bureau’s efforts to improve the count have met with 
considerable success. The widespread discouragement and negativism with 
regard to so-called “traditional methods” is unwarranted. 

(A chart showing the estimated undercount rates for each census since 1940 
appears in Figure 1 of my first paper.¹ See also the answer to Question 5 
below.) 

My suggestions for reducing undercount and reducing the undercount 
differential fall into two general categories: (1.) improving the census 
enumeration, and (2.) estimating the amount of undercount for those 
demographic groups and levels of geography for which reliable estimates can 
be made instead of adjusting for undercount. 

1. Improving the census enumeration. Most of the following suggestions for 
improving the count are not original, and they can be considered 
“traditional methods,” like those that have made the last two censuses the 
most successful in our history: 

(a.) The Master Address File (MAF) is a key to the success of the census. 
The Local Review Program and other efforts to improve the MAF 
should receive all the resources and attention that they need to 
succeed. 

(b.) Another key to the success of the census is the number and quality of 
enumerators. One reason for the success of the 1980 Census may 
have been the large number of recent college graduates who were 
unemployed and available to work for the Census Bureau. With the 
aging of the Baby Boom generation, such a pool of labor was not 
available for the 1990 Census. Due to a relatively small number of 
young people and the possibility of a continued sound economy, 
recruitment of skilled temporary workers for Census 2000 may be 
very difficult. Meeting this challenge needs to be a high priority. 

(c.) Yet another key to the success of the census is adequate time in 
which to conduct follow-up. If Integrated Coverage Measurement 
is not implemented, some of the time currently allotted to the 
coverage survey could be used for regular census operations. 

(d.) Since many households have more than five members, the standard 
census form should have room for information on more than five 
people. 

(e.) An effort should be made to ensure that every household receives all 
the census forms that it needs before Census Day. 

The proposed use of pre-census reminder cards is a promising 
innovation. The Bureau could consider the possibility of including 
return-cards that households can use to request foreign-language 
forms, extra forms for additional household members, and any other 
special forms and assistance that the household might need. 

(f.) Some households include members who may want to keep their 
census information confidential from other members of the 
household (or from whom the rest of the household may want to 
keep their census information confidential). There could be 
provisions for them to receive and submit separate census forms. 



(g.) The traditional “substitution” process for non-respondents and partial 
respondents could be modified so that the mix of respondents in the 
“deck” from which substitutions are made reflects the characteristics 
of non-respondents and partial respondents, rather than reflecting the 
characteristics of the population as a whole. This should reduce the 
undercount differential. 

Many other good ideas for improving the enumeration have been 
suggested by other analysts, and many have already been adopted by the 
Census Bureau. 

2. Estimating undercount instead of adjusting for undercount. Even after 
every effort to achieve the best possible count, there will be some 
segments of the population that have not been fully counted. This 
problem can be addressed more appropriately through estimates of 
undercount than through adjustments for undercount. The advantages of 
approaching undercount in this manner include the following: 

(a.) An estimate of undercount would not have to be released until it is 
completed and evaluated. An adjustment for undercount would have 
to be finalized very quickly to meet the statutory deadlines for 
completion of the census. 

(b.) An estimate of undercount could be revised as more is learned about 
patterns of undercount in the census. An adjustment for undercount 
could not be changed even after it is found to be faulty, since it 
would be the official census count and since it would be reflected in 
hundreds of census products that would not be feasible to replace. 

(c.) An estimate of undercount could use all relevant sources of valid 
information. The proposed method of adjusting for undercount is 
limited to one source of information — a post-enumeration survey — 
which misses many of the same people who are missed by the census 
and identifies many people as missed by the census who were not 
missed at all. 

(d.) An estimate of undercount could be developed for only those levels 
of geography for which it is reliable. For example, if a methodology 
works well at the state and national levels but not at the local level, 
undercount estimates would not have to be made at the local level. In 
contrast, the proposed adjustment for undercount would be applied all 
the way down to the block level. 

[See also the answer to Question 8 below.] 

3. You have mentioned your concerns about block level accuracy. Can you 
discuss your thoughts on the accuracy of census numbers at the state level 
if Dual System estimation is used in 2000? Do you have any evidence that 
suggests that the census counts will be more accurate at the state level in 
2000 if DSE is not used? 

The central flaws of the proposed method of adjusting for undercount, which 
are explained in the papers that I submitted to the Subcommittee on 5/5/98,¹,² 
are (a.) that it misses many of the same people who are missed by the census, 
and (b.) that many — in fact, most — of the people that it identifies as missed by 
the census were not missed at all. Thus, any differences it suggests between 
states are not so much differences in the amount of undercount as they are 
differences in the amount of error that the Census Bureau makes in trying to 
measure undercount. 
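
The two flaws described above can be illustrated with the standard dual-system (capture-recapture) estimator, in which the population estimate is the census count times the survey count, divided by the number of people matched in both. The sketch below is a simplification under stated assumptions: it ignores erroneous enumerations, post-stratification, and weighting; the function name is hypothetical; and every number is invented for illustration.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Basic dual-system (Lincoln-Petersen) estimate of the true population."""
    return census_count * survey_count / matched

# Hypothetical area with 1,000 people: census counts 950, survey counts 900.
# If the two counts were independent, expected matches = 950 * 900 / 1000 = 855.
baseline = dual_system_estimate(950, 900, 855)

# Matching error: 20 true matches are wrongly recorded as non-matches, so
# people who were counted look "missed" and the estimate rises above 1,000.
false_nonmatch = dual_system_estimate(950, 900, 835)

# Correlation bias: the survey tends to miss the same people the census
# missed, so more of the survey's people match (870 here) and the estimate
# falls below 1,000.
correlated = dual_system_estimate(950, 900, 870)

print(round(baseline), round(false_nonmatch), round(correlated))
```

In this invented case a matching error rate of roughly 2 percent moves the estimate by more than the whole effect of the correlated misses, which is the sense in which differences between areas can reflect measurement error as much as undercount.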

Several of the sources of bias noted in my testimony are of particular relevance 
at the state level. For example: 

• The exclusion of homeless people from the post-enumeration survey results 
in a bias against states whose homeless people are more likely to be staying 
with households during the April census than during the subsequent post- 
enumeration survey. 

• Differences in weather and climate can affect the level of fabrication in the 
post-enumeration survey, which in turn can have a very serious impact on 
the apparent undercount rate. 

• Because differences in weather and climate influence the likelihood that 
people will be at home when an enumerator visits, they can affect the 
proportion of successful PES interviews in different states. A high rate of 
unsuccessful interviews or proxy interviews in the PES can seriously 
increase the level of error in measuring undercount. 

• When people migrate from one state to another on a seasonal basis, the 
post-enumeration survey can assign them to a different state from the one 
they reported as their “usual” state of residence when they filled out their 
census form. 

[See also the answer to questions 4 and 20 below.] 

4. Secretary Mosbacher, in testimony before both the House and the Senate, 
said that the Post-Enumeration Survey would make the majority of the 
states more accurate. Is that statement correct? If so, why is his testimony 
so at odds with your testimony? 

In the “Notice of Final Decision” regarding adjustment of the 1990 Census, 
Secretary Mosbacher wrote: 

Based on the measurements so far completed, the Census Bureau 
estimated that the proportional share of about 29 states would be made 
more accurate and about 21 states would be made less accurate by 
adjustment... When the Census Bureau made allowances for plausible 
estimates of factors not yet measured, these comparisons shifted toward 
favoring the accuracy of the census enumeration. Using this test, 28 or 29 
states were estimated to be made less accurate if the adjustment were to be 
used... While we know that some will fare better and some will fare 
worse under an adjustment, we don’t really know how much better or how 
much worse. If the scientists cannot agree on these issues, how can we 
expect the losing cities and states as well as the American public to accept 
this change? [Congressional Record, 7/22/91, page 33583] 

This statement by Secretary Mosbacher is not at odds with my testimony. The 
figures cited, which involve comparing the adjusted counts to calculations 
based on assumptions about actual undercount in each state, are consistent with 
everything I have said about high levels of error in the Post-Enumeration 
Survey. An adjustment methodology that seemed to be less accurate than the 
census for 21 or 28 or 29 states in 1990 can hardly be considered a sound basis 
for fine-tuning the results of the next census. 

[See also the answers to Question 3 above and Question 20 below.] 

5. The 1990 census cost 20 percent more per household in real dollars than 
the 1980 census. The 1980 census cost twice as much per household in real 
dollars as the 1970 census. That is an increase in real dollar cost per 
household of 250 percent with no improvement in the differential 
undercount. Does that suggest to you that spending more on traditional 
methods will reduce the differential undercount? 



In addressing this question, it is important to remember that the 1980 and 1990 
censuses were the most successful in history with respect to minimizing 
undercount. Based on the Census Bureau’s “demographic analysis” method, 
the 1.8% estimated undercount in 1990 compares favorably to the estimated 
undercounts for 1940 (5.4%) through 1970 (2.7%). Likewise, the estimated 
undercount for blacks in 1990 (5.7%) compares favorably to the estimated 
undercounts for blacks for 1940 (8.4%) through 1970 (6.5%). The estimated 
1990 undercounts for blacks, for other races, and for the population as a whole 
are the second best ever recorded; the only census with better results was the 
1980 Census. (See Figure 1 of my first paper.¹) 

My assessment of these figures is that the Census Bureau has made a lot of 
progress through the so-called “traditional methods.” Since a number of 
promising improvements have been incorporated in the plans for Census 2000 
and further improvements remain to be explored, it appears that the “traditional 
methods” hold promise for further progress. [See also the answer to Question 2 
above.] 

6. Demographic analysis showed higher undercounts of African Americans 
than the undercounts demonstrated by the Post Enumeration Survey. 
That suggests that the Post Enumeration Survey understates, not 
overstates, the undercount, especially for minorities. In other words, isn’t 
it likely that the 1990 census missed more African-Americans than would 
have been added back into the census by the Post Enumeration Survey? 

As you note, there are substantial discrepancies between the undercounts 
suggested by the post-enumeration survey and those suggested by demographic 
analysis. These discrepancies can be seen in Figure 2 of my first paper¹: 
Relative to the results of demographic analysis, the undercount adjustments 
that were proposed for the 1990 census were 36% too low for black males but 
43% too high for black females at the national level. The adjustments for other 
males were about right at the national level, but the adjustments for other 
females were 133% too high. Subsequent to correction of several errors, the 
adjustments proposed in September 1992 were 42% too low for black males 
and 33% too high for black females at the national level. The adjustments for 
other males were 25% too low, and the adjustments for other females were 
50% too high. The situation was even worse at the regional level, where the 
proposed adjustments presented an inconsistent mosaic of high and low 
adjustments for different age, race, and sex categories. 




The birth data and other data used in demographic analysis provide a very solid 
basis for estimating the relative number of males and females that were missed 
by the census. The discrepancies between the PES and demographic analysis 
therefore demonstrate quite clearly that the undercount adjustments derived 
from the PES are implausible and unreliable. However, one obviously cannot 
go beyond that to characterize them as consistently overstating or understating 
the undercount of minorities. 

7. You have talked a lot about bias in the Post Enumeration Survey but have 
not talked much about the bias in the census. The differential undercount 
measured by demographic analysis shows that bias in the census is quite 
real. If there is no Integrated Coverage Measurement, is it not the case 
that this bias in the census will continue? 

The various techniques for conducting a more accurate enumeration — 
including those listed in my response to Question 2 above, those discussed in 
reports by the National Academy of Sciences, those proposed by the Census 
Bureau, and others as well — can be expected to promote a modest 
improvement in undercount rates. As explained in my response to Question 2 
above, I believe that the remaining undercount is best addressed through 
population estimates rather than through census adjustments. 

8. Do you believe that it is acceptable for the census to consistently miss 
certain segments of the population — African Americans, Latinos, Asian 
Americans, poor people in rural and urban communities — at greater rates 
than the white population? If that is not acceptable, what do you propose 
be done to reduce the differential undercount? Can you offer any evidence 
that your proposal(s) will reduce the differential undercount? 

Although the Census Bureau tries very hard to count everybody and makes 
special efforts to count minorities and persons in poor communities, there are 
still some people who are missed. Regardless of whether they are missed 
because their living arrangements make them hard to count or because they 
intentionally avoid the census, it is desirable to know how many people each 
community really has and what their characteristics are. 

However, the methodology that has been proposed for adjusting the census is 
not acceptable: it reflects survey matching error more than it reflects 
undercount, it would greatly reduce the value of sub-national census data, it 
would invalidate comparisons over time, and it would not be demographically 
credible even at the national level. 





I do not know of any methodology that can produce acceptable adjustments for 
undercount. Such a methodology would have to meet several difficult criteria. 
Some of the criteria that come to mind are: 

(a.) It would have to reflect undercount, and not some other phenomenon that 
is distributed differently from undercount. 

(b.) It would have to be simple enough to be completed and verified within the 
tight statutory time frame for producing the census count. 

(c.) It would have to be sound enough to be recognized as valid and to need no 
major corrections or revisions after the census count is published. 

(d.) The level of sampling error and other errors would need to be small 
enough that they wouldn’t affect analysis of local census data more 
seriously than undercount itself. 

(e.) Variations in error over time would need to be small enough that they 
would not invalidate comparisons of detailed census data over time. 

The proposed adjustment methodology does not meet any of these criteria, and 
I know of no alternative adjustment methodology that meets them all. 

As indicated in the answer to Question 2 above, the problem of undercount can 
be addressed by (a.) conducting a more complete count, and (b.) developing 
estimates of undercount instead of adjustments for undercount. A properly 
designed estimate could meet the first and last criteria, and the remaining 
criteria would be inapplicable or relaxed. An estimate would be subject to 
review and revision, it would not have to be subject to a tight statutory time 
frame, and it would not have to be applied to small units of geography unless it 
was found to be valid for small units of geography. 

9. It has been stated that one of the faults of the 1990 PES was correlation 
bias. Can you explain correlation bias? I understand that it is the 
likelihood that the people missed in the census may be the same people 
missed in the PES. Said another way, both the census and the survey miss 
the same people, for example, young Black males. How does correlation 
bias affect the accuracy of the count of those traditionally undercounted: 
Blacks, Hispanics, Native Americans, renters? 





Your understanding of correlation bias is correct. Correlation bias should lead 
to a very substantial underestimate of the undercount for those groups which 
tend to be missed by both surveys. 
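
To see why correlation bias pushes the estimated undercount downward, consider a minimal sketch. It assumes a simple two-list (capture-recapture) estimator of the form census count times survey count divided by matched count, which is a simplification of the actual PES methodology. The group sizes and coverage rates are hypothetical and chosen only for illustration.

```python
# Illustrative only: hypothetical groups and coverage rates, not 1990 data.
# Each group is (people, share counted by census, share counted by survey).
# Within a group the two counts are treated as independent.

def dual_system_estimate(groups):
    """Return (expected census count, two-list population estimate)."""
    census_count = sum(n * pc for n, pc, ps in groups)
    survey_count = sum(n * ps for n, pc, ps in groups)
    matched = sum(n * pc * ps for n, pc, ps in groups)
    estimate = census_count * survey_count / matched
    return census_count, estimate

groups = [
    (900, 0.98, 0.98),  # easy to count in both the census and the survey
    (100, 0.60, 0.60),  # hard to count in both (source of correlation bias)
]
true_population = sum(n for n, _, _ in groups)

census_count, estimate = dual_system_estimate(groups)
true_undercount = true_population - census_count
estimated_undercount = estimate - census_count

print(f"true undercount:      {true_undercount:.1f}")
print(f"estimated undercount: {estimated_undercount:.1f}")
```

With these made-up inputs the true undercount is 58 people, while the two-list estimate recovers only about 44, because the people missed by the census are disproportionately the ones also missed by the survey.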

10. Wouldn’t the only risk of correlation bias be minimization of the 
undercount rather than overestimation of the undercount? 

That is only one of the risks. Another problem is that some communities might 
have more correlation bias than others. This is one of several factors that can 
cause the adjusted counts to be less indicative of a community’s share of the 
nation’s population than the original counts. 

Another problem with correlation bias is that analysts who dismiss it as 
innocuous sometimes seem to forget that it is there. Correlation bias should 
result in adjustments for undercount that are much too low. However, the 
undercount adjustments derived from the 1990 Post-Enumeration Survey were 
not much too low: they were much too high for some segments of the 
population, much too low for others, and about on target for the national 
population as a whole. Analysts who forget about correlation bias and focus 
only on the seemingly plausible picture of undercount for the national 
population as a whole can make the mistake of thinking that the PES provides 
reasonably accurate information about undercount. However, for analysts who 
do not forget about correlation bias, the fact that the adjustments derived from 
the PES are not consistently too low is a clear sign that there is something 
seriously wrong with them. 

11. In testimony before the Senate Committee on Governmental Affairs 
approximately one year ago, Dr. Lawrence Brown, Professor of Statistics 
at the University of Pennsylvania, stated that “Statistical sampling 
methods can be used in an effective and objective way to assist the census 
process.” Do you agree with Dr. Brown’s statement? If you disagree, 
please explain why. 

While I do not disagree with this statement, I would add that statistical 
sampling methods can be used in ways that are effective and ways that are 
ineffective, in ways that are objective and ways that are biased, and in ways 
that assist and ways that detract from the census process. Like any tools, 
statistical sampling methods work better for some purposes than for others, and 
they can be used in both appropriate and inappropriate ways. 




12. Dr. Lawrence Brown also testified before Senator Thompson that the 
Sampling for Non-Response Follow-up plan “is an objective procedure all 
the way around and has a very good chance of working as desired.” Do 
you agree with that statement? If you disagree, please explain why. 

My testimony and analysis have focused exclusively on the issue of undercount 
adjustment, and I have not comprehensively reviewed the methodology 
proposed for handling non-response. Nevertheless, the following observations 
should be helpful for understanding some of its shortcomings. 

An underlying premise of sampling for non-response is that each census 
statistic will be based mostly on actual responses, and that it will therefore not 
be seriously affected by minor errors in estimating the characteristics of the 
remaining 10% or so of the population from a sample. 

One critical statistic for which this premise does not hold is the vacancy rate. 
Obviously, most vacant households will not respond to the census. It is my 
understanding that most of them are to be excluded from follow-up based on 
reports by letter carriers that they are vacant. (The plan calls for a sample of 
these housing units to be followed up, however, in order to adjust for 
inaccuracies in the letter carriers’ vacancy reports.) Any vacant units that the 
letter carriers do not report as vacant are to be followed up on a sample basis 
along with other non-responding households. Unfortunately, neither the letter 
carrier reports nor the proposed samples will produce reliable vacancy data. 

The letter carrier reports tend to be inaccurate, their errors cannot be corrected 
very well through the proposed sample, and the routine sampling of non- 
responding housing units will be subject to error as well. 

In its preliminary testing, the Census Bureau found that 42% of the housing 
units that letter carriers identified as vacant were actually occupied, and that 
half of the units pre-identified as vacant were not identified as such by the letter 
carriers. If this result is at all indicative of the level of error to be expected in 
the letter carrier reports, they provide a very poor basis for determining 
vacancy status. 
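
The two test results quoted above can be turned into unit counts with a short calculation. This is a rough sketch under two assumptions that are not in the test report as quoted here: that the pre-identified vacancy status is treated as correct, and that both rates apply to the same set of housing units. The figure of 100 truly vacant units is a hypothetical scale.

```python
# Rates are the two test results quoted above; the count of 100 truly
# vacant units is a hypothetical scale chosen for illustration.

def carrier_report_breakdown(truly_vacant, flagged_but_occupied_share,
                             vacant_not_flagged_share):
    """Split letter-carrier vacancy reports into hits, false flags, misses."""
    correctly_flagged = truly_vacant * (1 - vacant_not_flagged_share)
    # Correct flags make up (1 - flagged_but_occupied_share) of all flags.
    total_flagged = correctly_flagged / (1 - flagged_but_occupied_share)
    return {
        "correctly_flagged": correctly_flagged,
        "occupied_but_flagged": total_flagged - correctly_flagged,
        "vacant_but_missed": truly_vacant - correctly_flagged,
    }

breakdown = carrier_report_breakdown(100, 0.42, 0.50)
for label, units in breakdown.items():
    print(f"{label}: {units:.1f}")
```

Under these assumptions, for every 100 truly vacant units the carriers would correctly flag 50, miss 50, and wrongly flag roughly 36 occupied units as vacant.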

These deficiencies of the letter carrier reports cannot be corrected adequately 
even through the 30% sample recently proposed. Variations in the accuracy of 
letter carrier reports from neighborhood to neighborhood and from carrier to 
carrier will present a serious dilemma: If the correction factors are derived 
from broad geographic areas, they will not be applicable to neighborhoods 
where vacancy status is particularly easy or particularly hard to determine, nor 
to neighborhoods where the letter carrier has particularly high or particularly 
low levels of skill and conscientiousness in determining vacancy status. But if 
they are derived from small geographic areas, they will tend to be dominated 
by sampling error. Whichever way the Census Bureau chooses to resolve this 
dilemma, the correction factors will be unreliable for small units of geography. 
The poor overall quality of the letter carrier reports, in turn, will cause those 
unreliable correction factors to have a very large impact on the vacancy rates. 

A similar dilemma arises in connection with vacant units in the “regular” 
sample of non-responding households. The number of vacant units missed by 
the letter carriers can be expected to vary widely from neighborhood to 
neighborhood: Data derived from broad geographic areas will therefore not be 
indicative of local conditions, but data derived from small geographic areas 
will tend to be dominated by sampling error. Finding even one vacant housing 
unit in the sample can cause several housing units to be considered vacant, 
which can substantially change the vacancy picture for a census block or a 
small community. Any error — whether sampling error or non-sampling 
error — will therefore tend to have a serious impact. And since we are talking 
about measuring a (usually) small proportion of households through a small 
sample drawn from a small population, relatively high levels of error can be 
expected. 
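
The effect of sample size on a small proportion can be sketched with the textbook standard error of a sample proportion. This assumes simple random sampling with no finite-population correction, which is simpler than the actual sample design, and the 10% vacancy share is a hypothetical value.

```python
import math

def proportion_standard_error(p, sample_size):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / sample_size)

vacancy_share = 0.10  # hypothetical share of sampled units that are vacant

for sample_size in (10, 100, 10_000):
    se = proportion_standard_error(vacancy_share, sample_size)
    print(f"n={sample_size:>6}: standard error = {se:.3f} "
          f"({se / vacancy_share:.0%} of the share itself)")
```

With a sample of ten units, the standard error is almost as large as the vacancy share being measured, which is the sense in which small-area data are dominated by sampling error.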

These problems would be much less serious if 90% of the data on vacancy 
were based on actual enumerations and only 10% of the data were subject to 
substantial error. However, that will not be the case due to the fact that most 
vacant housing units do not respond to the census. Unlike most other census 
statistics, the numerator of the vacancy rate is to be almost entirely based on 
very imprecise data. 

A problem with faulty vacancy rates is far more critical than it may seem at 
first glance. In addition to being an important statistic in its own right, the 
vacancy rate plays a crucial role in determining the census count itself. If the 
estimated vacancy rate for a unit of government is 2 percentage points too low, 
then people will be imputed as living in vacant housing units and we can expect 
the population count to be a little more than 2 percentage points too high. If 
the estimated vacancy rate is 2 percentage points too high, then housing units 
that are occupied will be assumed to be vacant and we can expect the 
population count to be a little more than 2 percentage points too low. 
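
The "a little more than 2 percentage points" figure follows from simple arithmetic. The sketch below assumes a fixed number of housing units and the same average household size in every occupied unit, so that population is proportional to the share of units treated as occupied. The 10% true vacancy rate is a hypothetical input.

```python
# Assumes a fixed number of housing units and the same average household
# size in every occupied unit, so population is proportional to the
# share of units treated as occupied. Inputs are hypothetical.

def relative_population_error(true_vacancy_rate, vacancy_rate_error):
    """Relative error in the count caused by an error in the vacancy rate."""
    true_occupied = 1 - true_vacancy_rate
    estimated_occupied = 1 - (true_vacancy_rate + vacancy_rate_error)
    return estimated_occupied / true_occupied - 1

for error in (-0.02, 0.02):
    effect = relative_population_error(0.10, error)
    print(f"vacancy rate off by {error:+.2f} -> count off by {effect:+.2%}")
```

With a true vacancy rate of 10%, a 2-point error moves the count by about 2.2%, and the effect grows as the true vacancy rate rises.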

Errors of this magnitude and greater would be quite likely for many units of 
government, particularly where there is a substantial amount of seasonal or 
vacation housing. For example, 49% of the units of government in Michigan 
had vacancy rates of 10% or more in 1990, 31% had vacancy rates of 25% or 
more, and 14% had vacancy rates of 50% or more. The proportion of housing 
units in these areas whose vacancy status would be determined by very 
imprecise methods would therefore be quite substantial, and the resulting 
census “counts” could easily be off by several percentage points. 

As a demographer involved in the production of intercensal population 
estimates, I am very much aware of the weaknesses and limitations of those 
estimates and of the need for periodically benchmarking them to new census 
counts. I am therefore alarmed by the prospect that the proposed methodology 
might produce census “counts” for many units of government that are less 
reliable than their intercensal population estimates based on the 1990 Census, 
and that future population estimates for these areas might have no accurate 
basis at all. 


Another potential problem with sampling for non-response is the possibility of 
distortions in local population data caused by replicating cases encountered in 
the sample. For example, if the methodology turns one household with a 
grandmother caring for grandchildren into several local households with 
grandmothers caring for grandchildren, or one household with twelve children 
into several local households with twelve children apiece, then the local census 
data will be seriously distorted. Thus, it would not be appropriate to replicate 
the findings from the sample within a small geographic area. (It may be 
appropriate, however, to use large-area samples as a basis for assigning weights 
to local census responses in order to influence the composition of the “deck” 
used for imputing the characteristics of non-responding households. See item 
1(g.) under Question 2 above.) [Problematic aspects of sampling for non- 
response are discussed further in Question 24 (c.) below.] 

13. In addition, Dr. Brown testified that the Census Bureau’s 2000 Census 
plan had been “drastically simplified and improved. ... [these changes] 
make it possible to now believe that the Integrated Coverage Measurement 
might work as well as desired to correct the undercount.” Do you agree 
with that statement? If you disagree, please explain why. 

I strongly disagree with this statement. The two papers which I submitted as 
testimony to the Subcommittee on 5/5/98 are entirely directed toward 
explaining my position on this question.¹,² 




14. With regard to concerns that the Integrated Coverage Measurement 
process could be manipulated to achieve a particular outcome in terms of 
the population counts, Dr. Brown testified that, “if all of this planning is 
done in advance, it is very, very hard for me to see how one could direct 
these subjective decisions towards any desired goal.” Do you agree with 
Dr. Brown that if the procedures and protocols for the Integrated 
Coverage Measurement are set forth in advance and subject to expert and 
public scrutiny, that it is very unlikely that the sampling and statistical 
estimation process will be subject to manipulation, possibly for political 
advantage? If you disagree, please explain why. 

Subjective decisions can bias the results in ways that are not necessarily even 
intentional, conscious, or politically motivated. The most frequent and most 
likely way for this to happen is for personnel at various levels of the ICM 
effort — particularly interviewers, matchers, and the managers and statisticians 
responsible for implementing the methodology — to be influenced in their 
subjective decisions by their expectations about undercount. For example, 
when the match status of a particular record is not clear, it is possible for the 
classification to be influenced by whether the matcher expects people in that 
demographic category to have a high level of undercount. When a PES 
interviewer fabricates data on a hot or rainy day for people who never seem to 
be at home, the characteristics assigned to those people will naturally reflect 
the expectations of that interviewer. When a decision is made about whether to 
send a group of records back for re-matching or to downweight a group of 
records as outliers, that decision can be influenced by whether the initial findings 
for those records were consistent with expectations about undercount and by 
whether the overall level of apparent undercount is higher or lower than 
expected. 

15. Dr. Brown also testified that even after the non-response follow-up phase 
of the census is complete, there “would still [be] the undercount problem 
of those people who just refuse to be counted or are very difficult to 
count.” Do you agree with that statement? If you disagree, please explain 
why. 

I agree with that statement. A substantial portion of this problem is already 
handled through the Census Bureau’s traditional “imputation” or “substitution” 
process for non-respondents and partial respondents. The importance of this 
element of the census process is frequently overlooked and, as explained in the 
answers to questions 2 and 12, this process can be improved. The remainder of 
the problem, as explained in the answers to questions 2 and 8, can be better 
solved through an estimate of undercount rather than an adjustment for 
undercount. 

16. With regard to the post-enumeration survey in the 1990 census, Dr. Brown 
testified that many of the difficulties with the procedure “can be traced to 
the fact that the PES sample was much too small to support the kind of 
objective, reliable analyses that are desired.” Do you agree with that? If 
you disagree, please explain why. 

One of the interesting things about measuring undercount through a post- 
enumeration survey is that the process has several fatal flaws, any one of which 
is sufficient by itself to explain why it produces such unacceptable results. One 
such flaw is sampling error due to a sample size that was insufficient to support 
the detailed stratification which the undercount adjustments require. This was 
such a big problem that there is no implausible aspect of the 1990 adjustments 
for which it is not a sufficient explanation. 

It would be a fallacy, however, to conclude that sampling error is therefore the 
only explanation or even the chief explanation for the many implausible 
aspects of the 1990 adjustment factors. There are several other documented 
problems which are also sufficient by themselves to explain them. For 
example, the documented level of uncertainty and error in matching is 
sufficient to explain any of these implausible results. The level of fabrication 
in typical surveys, which was generally confirmed by the various studies of 
fabrication in the PES, is comparable in size to undercount and sufficient to 
explain any of these implausible results. Likewise, any of the implausible 
results can be explained by the fact that such an attempt to measure a small 
component of the population is extremely sensitive to tiny errors in the 
insurmountable task of classifying the remainder of the population. (See pages 
6 through 9 of my first paper.¹) It would be foolish to presume that solving 
only one of these problems would be sufficient to “fix” the proposed process 
for measuring undercount. There would be more than enough problems 
remaining to invalidate the results. 

17. The size of the sample in the Integrated Coverage Measurement (ICM) is 
750,000 households. Is that a proper size for such an endeavor? 

It is more than sufficient for the post-enumeration survey’s traditional role of 
evaluating census questions and procedures. However, no increase in sample 
size would be sufficient to produce valid adjustments for undercount through a 
post-enumeration survey, since sample size is not the only problem or even the 
chief problem. As explained in the answer to Question 16 above and in the 
papers which I submitted to the Subcommittee as testimony on 5/5/98,¹,² the 
attempt to measure undercount through a post-enumeration survey has several 
fatal flaws that are not caused by insufficient sample size. These flaws account 
for much of the estimated undercount and, since they involve non-sampling 
error, they obviously will not be reduced by enlarging the sample. In fact, an 
increased sample size, coupled with a very tight time schedule and questionable 
staffing levels, is likely to increase the problems of fabrication, proxy 
interviews, and matching error which plagued the 1990 PES. 

18. The results of the PES in 1990 showed that census was less accurate than 
its predecessor. That result was confirmed by demographic analysis, 
which has been performed on every census since 1940. We certainly know 
that the 1990 census was much more expensive than the 1980 census. Do 
you agree with the conclusion that 1990 was also less accurate than 1980? 

I have not studied this issue in detail. However, as explained in the answer to 
questions 2 and 5 above, it is appropriate to say that the Census Bureau’s 
“demographic analysis” method indicated that the 1980 Census was the most 
accurate in history and that the 1990 Census was only the second most accurate 
in history with respect to undercount. 

19. Please explain the difference between net over- or undercount in the 1990 
census count and actual over- and undercounts (mistakes) made in the 
1990 count. I know that a net undercount of 1.6% sounds relatively small 
but for census purposes, aren’t those 26 million mistakes a concern? 

There are three sets of terms that need to be explained: (a.) actual gross 
overcount and undercount, (b.) gross measured overcount and undercount, and 
(c.) net measured overcount and undercount. 

(a.) “Actual gross overcount” is the number of people actually counted twice 
by the census or counted in error. For example, people who were born 
after April 1 or who died before April 1 are sometimes counted by the 
census even though they should not be. College students who are counted 
at their parents’ home instead of at the school where they lived are 
considered part of the “overcount” of their parents’ community and part of 
the “undercount” of their college community. Overcount is usually 
referred to as “erroneous enumeration.” Similarly, “actual gross 
undercount” is the number of people actually missed by the census. 




(b.) “Gross measured overcount” and “gross measured undercount” are 
appropriate terms for the number of people identified as erroneous 
enumerations by the Post-Enumeration Survey and the number of people 
identified as undercounted by the Post-Enumeration Survey. The “26 
million mistakes” to which the question refers represent gross measured 
overcount and gross measured undercount. These numbers are much 
higher than actual gross overcount and actual gross undercount for 
several reasons: 

• Much of the measured undercount and overcount is due to 
measurement errors in the post-enumeration survey rather than actual 
undercount and overcount in the census. This is the central point 
developed in my papers. [See pages 6 through 9 of my first paper¹ and 
pages 3 through 13 of my second paper.²] 

• All of the people who are added to the census count through the 
substitution process and all of the people whose census responses are 
too incomplete to be used for matching are considered to be erroneous 
enumerations. The corresponding people who are found in those 
housing units by the Post-Enumeration Survey are considered to be 
part of the gross undercount. While this is appropriate in the context 
of the PES analysis, it does tend to make the gross measured overcount 
and gross measured undercount misleadingly high. 

• People who seem to be counted in the wrong location by the census 
are counted as part of the undercount in one place and part of the 
overcount in another. This is appropriate in the context of the PES 
analysis, but it tends to make the total number of errors appear 
misleadingly high. 

• Matching errors in the PES analysis typically involve a census record 
which should be matched with a PES record but which fails to match 
for any one of a number of reasons. In most such cases, the census 
record becomes part of the gross measured overcount and the PES 
record becomes part of the gross measured undercount. Again, this is 
appropriate in the context of the overall PES analysis, but it does tend 
to make the gross measured overcount and gross measured undercount 
misleadingly high. 

[It should be noted that matching error does not always result in 
offsetting errors in gross overcount and gross undercount. For 
example, if the person described by the unmatched census record 
really does exist, it might be difficult to prove that they don’t exist and 
they therefore might not become part of the measured overcount. This 
is one of the ways that matching error introduces bias into the 
undercount adjustments.] 

• Looking at the PES in a broader sense, it can be expected that the 
number of people erroneously identified as overcounted or 
undercounted will naturally tend to exceed the number of people 
erroneously identified as counted correctly. This is because only a 
very small proportion of the population is actually overcounted or 
undercounted: in other words, there are very few people at risk of 
being erroneously identified as counted correctly. However, the vast 
majority of people are counted correctly by the census, and they are 
therefore at risk of being erroneously identified as overcounted or 
undercounted. This results in a large upward bias in the gross 
measured overcount and the gross measured undercount. [This issue is 
discussed in more detail on pages 6 through 9 of my first paper.¹ On 
page 9 of that paper, there is a list of eighteen problems which make it 
very difficult to match people correctly between two surveys so that 
they can be classified accurately as overcounted, undercounted, or 
correctly counted.] 

(c.) “Net measured undercount” can be simply computed by subtracting gross 
measured overcount from gross measured undercount. (If an area has 
more measured overcount than measured undercount, its “net measured 
overcount” can be calculated by subtracting its gross measured undercount 
from its gross measured overcount.) 

Thus, the frequently cited figure of “26 million mistakes” is greatly inflated, 
and it does not reflect the actual level of accuracy in the 1990 Census. 
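
The base-rate point made in the last bullet under (b.) can be illustrated with a short calculation. The inputs are hypothetical: a population of 1,000, a true miscount rate of 2%, and a classification error rate of 1% applied equally in both directions. None of these figures comes from the PES.

```python
# Hypothetical inputs: the 2% miscount rate and the 1% classification
# error rate are illustrative assumptions, not figures from the PES.

def measured_miscount(population, true_miscount_rate, error_rate):
    """Miscounted people as measured when classification is imperfect."""
    truly_miscounted = population * true_miscount_rate
    truly_correct = population - truly_miscounted
    wrongly_flagged = truly_correct * error_rate      # correct -> "miscounted"
    wrongly_cleared = truly_miscounted * error_rate   # miscounted -> "correct"
    return truly_miscounted - wrongly_cleared + wrongly_flagged

actual = 1000 * 0.02
measured = measured_miscount(1000, 0.02, 0.01)
print(f"actual miscounted:   {actual:.1f}")
print(f"measured miscounted: {measured:.1f}")
```

Even with the same 1% error rate in both directions, the measured figure is inflated by nearly half, because there are 49 correctly counted people at risk of a false flag for every miscounted person at risk of a false clearance.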

20. I understand that improvement in the average does not necessarily mean 
that there will be improvement in every case. In 1990, there was criticism 
about the strata being broken down by region. If statistical methods are 
used in 2000, with strata broken down by state in 2000, can we expect 
more states with improved accuracy than there were in 1990? 

Since the undercount adjustments reflect error in measuring undercount more 
than they reflect undercount itself, any prediction of how the numbers will fall 
out in any particular census is very uncertain. With that caveat, my 
expectations are as follows: 

(a.) Estimating the adjustments for each state individually will negate most of 
the advantage otherwise gained from a larger sample size in terms of 
sampling error. 

(b.) The factors which introduced geographic bias into the 1990 undercount 
adjustments will tend to affect individual states in the same way that they 
affected regions in 1990. [See answer to Question 3 above.] 

(c.) Since state boundaries are as artificial as regional boundaries in terms of 
having a logical relationship with undercount rates, I see no reason at this 
time to expect an increase in accuracy resulting from this change in 
stratification. 

[See also the answers to questions 3 and 4 above.] 

21. Representative Sawyer pointed out that the longer the Census Bureau is in 
the field, the higher the error rate in the information collected. I believe 
that information came from one of the many GAO studies he and his 
Republican colleagues commissioned. You have stated your concern about 
the Census Bureau not being in the field for enough days in the 2000 plan. 
Can you explain the difference in opinion? 

There is no contradiction between the findings which you cite and the concern 
about trying to process more interviews with inadequate staff in a shorter 
period of time. In fact, the findings reinforce the concern. 

The higher error rates during the final weeks of follow-up do not result simply 
from “being in the field too long.” The first weeks in the field result in more 
accurate data because they involve actual interviews with people who are 
willing to be counted. The final weeks in the field result in less accurate data 
because they involve more interviews with people who have resisted repeated 
attempts to count them, more proxy interviews to “close out” cases for which a 
direct interview cannot be obtained, and more fabrication of interviews in 
response to pressure to close out as many cases as possible before the deadline. 

Shortening the amount of time in the field does not eliminate those final weeks 
of interviewing in which high error rates can be expected. The final weeks of 
interviewing will still be there, with all of their pressure to close out the 
difficult cases. Instead of eliminating the final weeks of interviewing, the 
current plan would, in effect, eliminate the initial weeks of interviewing in 
which lower error rates can be expected. By calling for more PES interviews 
in a shorter period of time with inadequate staff, the current plan creates a 
danger that the initial weeks of interviewing will be as error-prone as the final 
weeks of interviewing were in 1990. 

It should be noted that the accuracy problems in the final weeks of interviewing 
and the concerns about truncated time frames apply both to the census itself 
and to the post-enumeration survey. Proxy interviews, fabrication by 
interviewers, and unreliable reports by respondents are problems for the PES as 
well as for the census — in fact, they are even more serious when they occur in 
the PES. The timetable for Census 2000 involves very tight time frames for 
both the census and the PES. 

[See also the response to Questions 24(a.) and 24(c.) below.] 

22. In order to address the problem of declining public response, the GAO 
suggested exploring a radically streamlined questionnaire in future 
censuses. Would you give us your thoughts on how effective this approach 
might be in increasing response, and also its effect on perhaps diminishing 
the usefulness of census data? 

I have not studied this question in detail. I understand that the Census Bureau 
has concluded from its research that shortening the form would not have a large 
impact on response rates. I do know, based on the involvement of my office in 
the Census Bureau’s survey of data users and from its work in disseminating 
census data and in using census data to address needs of data users, that the 
information on both the long form and the short form is very widely used in 
both the public and private sectors. A radically shortened questionnaire would 
greatly diminish the value of the census. However, if we have a successful 
census in 2000, and if the Continuous Measurement program is adequately 
funded and successfully implemented, it should be possible to eliminate the 
long form in 2010. 

23. In its 1992 capping report on the 1990 census, the GAO concluded that 
“the results and experiences of the 1990 census demonstrate that the 
American public has grown too diverse and dynamic to be accurately 
counted solely by the traditional “headcount” approach and that 
fundamental changes must be implemented for a successful census in 
2000.” Do you agree with that conclusion? If you disagree, please explain 
why. 

It is not entirely fair to criticize a statement removed from its context within a 
larger report, so the following comments should not be interpreted as a 
criticism of the GAO or its 1992 report. 

(a.) First, it is important to realize that our diverse and dynamic population is 
not a new development. Our history has included settlement of the 
frontier, Indian wars, emancipation of slaves, massive foreign 
immigration, industrialization, urbanization, the Great Depression, 
suburbanization, inter-state redistribution of population, and many other 
events and changes that have always made our population diverse, 
dynamic, and challenging to count. As difficult as it is to develop a 
precise Master Address File for Detroit in 1998, it would have been far 
more difficult in 1898. 

(b.) I agree with the notion that there is considerable room for improvement in 
the census and that census methods should adapt to changes in the 
population. However, I am not sure exactly what is meant by 
“fundamental” changes. The concept of finding out how many people 
there are by counting them is sound, and I would characterize the required 
improvements as “incremental” rather than “fundamental.” 

(c.) The deficiencies of the census require not simply “change” but rather 
“change for the better.” It should be clear from my testimony and the 
testimony of the other members of the 5/5/98 panel that the particular uses 
of sampling that have been proposed for Census 2000 would be very 
serious changes for the worse. 

(d.) The 1990 Census approached our “diverse and dynamic” society, in which 
it is often difficult to find people at home, through a mail-back census 
form with instructions available in 34 different languages. It is somewhat 
ironic that the innovation proposed for dealing with these problems is a 
post-enumeration survey that relies exclusively on personal interviews by 
enumerators, most of whom speak fewer than 34 languages. The 
proposed innovation is more poorly adapted to our diverse and mobile 
society than the census itself. 

24. (a.) After the 1990 census, GAO concluded that “the amount of error in 
the census increases precipitously as time and effort are extended to count 
the last few percentages of the population... This increase in the rate of 
error shows that extended reliance on field follow-up activities represents 
a losing trade-off between augmenting the count and adding more errors.” 
In the last months of the follow-up efforts in 1990, GAO estimated that the 
error rates approached 30 percent, and that this problem was probably 
exacerbated by the use of close-out procedures. This appears to be a 
problem inherent to the methodology of the 1990 census. Don’t you agree? 

It is inherent not just to the census, but to any survey which must obtain 
information about people who are difficult to reach or resistant to being 
counted. These problems apply even more to Sampling for Non-Response and 
to the post-enumeration survey required for Integrated Coverage Measurement 
than they do to the census itself. These efforts not only involve exhaustive 
follow-up of difficult cases, but any errors will be multiplied when the sample 
results are inflated to represent the sampled universe. In fact, given the 
proposed constraints of time and resources discussed under Question 21 above, 
the proposed plans for Census 2000 can be expected to make these problems 
even worse. Again, it must be stressed that we need not just “change,” but 
“change for the better.” The proposed changes are even more susceptible to 
this problem than the old procedure was. 

[See also the response to Question 21 above and Question 24 (c.) below.] 
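
The multiplication of errors described in this response can be illustrated with a minimal sketch. The sampling rate, the household sizes, and the single erroneous record below are hypothetical values chosen only for illustration; they are not figures from the Census 2000 plan or from any Census Bureau report.

```python
def inflate(sample_total, one_in_n):
    """Scale a total observed in a 1-in-n sample up to the sampled universe."""
    return sample_total * one_in_n


# Hypothetical persons-per-household counts for four sampled households.
true_counts = [3, 2, 4, 1]

# The same sample, but one difficult case was resolved with a wrong count
# (4 recorded as 6), for example through a proxy interview.
recorded_counts = [3, 2, 6, 1]

ONE_IN_N = 10  # hypothetical rate: each sampled household stands for 10

# Error within the sample itself.
sample_error = sum(recorded_counts) - sum(true_counts)

# Error after the sample results are inflated to the sampled universe.
estimate_error = inflate(sum(recorded_counts), ONE_IN_N) - inflate(
    sum(true_counts), ONE_IN_N
)

print(sample_error)
print(estimate_error)
```

Under these assumed values, a mistake that would misstate a complete follow-up by two persons misstates the inflated estimate by ten times as much, because the erroneous household is weighted to represent ten households.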

(b.) Do you have any information on the error rates for information 
gathered using close-out procedures? 

The Census Bureau would be the most authoritative source for such 
information. 

(c.) Even if sampling is not perfect, isn’t its error rate well below the levels 
for the last percentages of the population using more traditional follow-up 
procedures? 

The premise underlying this question appears to be that sampling is somehow 
an alternative to traditional follow-up procedures. However, traditional 
follow-up procedures are just as much a part of the proposed uses of sampling 
as they are of the conventional census: follow-up is a critical part of Integrated 
Coverage Measurement, and follow-up is what Sampling for Non-Response is 
all about. Both of these efforts involve exhaustive efforts to obtain information 
about that last percentage of the population, and the associated errors will be 
compounded when the sample findings are inflated to represent the sampled 
universe. The pertinent comparisons would therefore be between the overall 
error of the traditional census and the overall error of the modified census, or 
else between the error resulting from close-out procedures for the samples and 
the error resulting from close-out procedures for a traditional census. It should 
be obvious from the discussion above that these comparisons would not be 
favorable to the proposed sampling methodology. 

That having been said, we are still left with a question about the overall error 
rate for sampling. With regard to sampling for undercount, a Census Bureau 
report estimated that identified errors accounted for about 33% of the net 
undercount suggested by the 1990 PES. A subsequent analysis by the same 
author raised this estimate to about 57%, and a further analysis by Dr. Leo 
Breiman raised the estimate to about 70%. (These reports are cited on pages 
11-13 of my second paper.²) Similarly, the Census Bureau’s Report of the 
Committee on Adjustment of Postcensal Estimates (the “CAPE Report,” 
released on 8/7/92) stated that “about 45% of the revised estimated undercount 
is actually measured bias and not measured undercount. In 7 of the 10 
evaluation strata, 50% or more of the estimated undercount is bias.” These 
error rates compare unfavorably with error rates for virtually any aspect of the 
census process, regardless of whether or not such comparisons can be 
pertinently drawn. 

(d.) If this is the case, then doesn’t that logically lead to GAO’s and the 
Commerce Department’s Inspector General’s conclusion that sampling at 
least a portion of the nonresponding households would increase the 
accuracy and decrease the cost of conducting the census? 

Even if the sampling methodologies did not share the census’s reliance on 
error-prone efforts to resolve difficult cases, the issues raised in the response to 
Question 12 above would still be pertinent. While there may be a place for 
sampling in improving the census, the particular procedure proposed for 
sampling nonrespondents appears to have some serious shortcomings. 

25. GAO also concluded after the 1990 census that a high level of public 
cooperation is key to obtaining an accurate census at reasonable cost. 
Unfortunately, the mail response rate has fallen with every census since 
1970, and was only approximately 65% in 1990. The reasons for this 
decline are in many instances outside of the Census Bureau’s control, for 
example the increase in commercial mail and telephone solicitations and in 
nontraditional household arrangements. For these reasons, the Bureau is 
planning a public education campaign for the 2000 census, surpassing any 
previous attempts. Given the response in 1990, do you believe this is 
money well spent? 

Do you believe that this public education campaign can succeed in 
arresting the decline in response rates? 

Even if it does, wouldn’t some use of sampling be warranted to solve the 
problems associated with reaching the last few percentages of 
nonresponding households? 

Taking the last question first, some of the appropriate and inappropriate uses of 
sampling with respect to non-response are addressed in the answer to Question 
12 above. 

I agree that a high level of public cooperation and a high response rate are keys 
both to obtaining an accurate census and to holding down costs. While I have 
not reviewed the Census Bureau’s publicity plans, I understand that they 
involve improvements to both the quality and the timing of the publicity 
efforts. (See also the answers to Question 2 and Question 5 above regarding 
the success of “traditional methods” in improving census participation.) 

It should be noted that the issue of undercount adjustment also has very 
significant implications for levels of public cooperation and response: 

• On the one hand, there is reason to believe that a decision to adjust the 
census would have a very serious negative effect on census participation. If 
people expect the census count to be adjusted, they may not think that the 
effort required to complete their census form is necessary. Similarly, the 
critical involvement of public officials and temporary census employees in 
securing high participation rates might be jeopardized by a decision to 
adjust the census. In the “Notice of Final Decision” on adjustment of the 
1990 Census, then-Secretary of Commerce Robert Mosbacher wrote: 

I am worried that an adjustment would remove the incentive of 
states and localities to join in the effort to get a full and complete 
count. The Census Bureau relies heavily on the active support of 
state and local leaders to encourage census participation in their 
communities... If civic leaders and local officials believe that an 
adjustment will rectify the failures in the census, they will be 
hard pressed to justify putting census outreach programs above 
the many other needs clamoring for their limited resources. 
Without the partnership of states and cities in creating public 
awareness and a sense of involvement in the census, the result is 
likely to be a further decline in participation. [Congressional 
Record, 7/22/91, page 33584.] 

There is a real risk that, with an expectation of a correction 
through adjustment, the field staff would not have the same sense 
of commitment and public mission in future censuses and, as a 
result, careless and incomplete work would increase, thereby 
decreasing the quality of census data. These are the workers the 
Bureau depends on to collect the data from the groups that are 
hardest to enumerate. If these data suffer, the information lost at 
the margin is information that is especially important to policy 
development. [Congressional Record, 7/22/91, page 33605.] 

• On the other hand, the current controversy over adjustment may play a 
positive role in encouraging census participation. This controversy has 
increased awareness of the importance of being included in the census on the 
part of civic leaders, local government officials, civil rights organizations, 
and the general public. It might be possible to translate this awareness into 
something that everybody will find superior to an adjustment for undercount: 
a census in which people get counted the first time. 


Thank you again for the opportunity to address these questions. I hope that these 
answers promote a greater understanding of the issues surrounding census 
undercount adjustment and that the resulting dialog will lead to a better census. 

Sincerely, 

Kenneth J. Darga, Senior Demographer 
Michigan Department of Management and Budget 


¹ Kenneth J. Darga, “Straining Out Gnats and Swallowing Camels: The Perils of Adjusting for Census Undercount,” 
Office of the State Demographer, Michigan Information Center, Michigan Department of Management and Budget. 
Paper submitted as testimony to the House Subcommittee on the Census, May 5, 1998. 

² Kenneth J. Darga, “Quantifying Measurement Error and Bias in the 1990 Undercount Estimates,” Office of the 
State Demographer, Michigan Information Center, Michigan Department of Management and Budget. Paper submitted 
as testimony to the House Subcommittee on the Census, May 5, 1998. 




1970 census. That is an increase in real dollar cost per household of 250 percent 
with no improvement in the differential undercount. Does that suggest to you that 
spending more on traditional methods will reduce the differential undercount? 

6. Demographic analysis showed higher undercounts of African Americans than the 
undercounts demonstrated by the Post Enumeration Survey. That suggests that 
the Post Enumeration Survey understates, not overstates, the undercount, 
especially for minorities. In other words, isn’t it likely that the 1990 census 
missed more African-Americans than would have been added back into the census 
by the Post Enumeration Survey? 

7. You have talked a lot about bias in the Post Enumeration Survey but have not 
talked much about the bias in the census. The differential undercount measured 
by demographic analysis shows that bias in the census is quite real. If there is no 
Integrated Coverage Measurement, is it not the case that this bias in the census 
will continue? 

8. Do you believe that it is acceptable for the census to consistently miss certain 
segments of the population -- African Americans, Latinos, Asian Americans, 
poor people in rural and urban communities — at greater rates than the White 
population? If that is not acceptable, what do you propose be done to reduce the 
differential undercount? Can you offer any evidence that your proposal(s) will 
reduce the differential undercount? 

9. It has been stated that one of the faults of the 1990 PES was correlation bias. Can 
you explain correlation bias? I understand that it is the likelihood that the people 
missed in the census may be the same people missed in the PES. Said another 
way, both the census and the survey miss the same people, for example, young 
Black males. How does correlation bias affect the accuracy count of those 
traditionally undercounted, Blacks, Hispanics, Asians, Native Americans, renters? 

10. Wouldn’t the only risk of correlation bias be minimization of the undercount 
rather than an overestimation of the undercount? 

11. In testimony before the Senate Committee on Governmental Affairs 
approximately one year ago, Dr. Lawrence Brown, Professor of Statistics at the 
University of Pennsylvania, stated that, “Statistical sampling methods can be used 
in an effective and objective way to assist the census process.” Do you agree with 
Dr. Brown’s statement? If you disagree, please explain why. 

12. Dr. Lawrence Brown also testified before Senator Thompson that the Sampling 
for Nonresponse Follow-up plan “is an objective procedure all the way around 
[and] has a very good chance of working as desired.” Do you agree with that 
statement? If you disagree, please explain why. 





13. In addition, Dr. Brown testified that the Census Bureau’s 2000 census plan had 
been “drastically simplified and improved. ...[these changes] make it possible to 
now believe that the Integrated Coverage Measurement might work as well as 
desired to correct the undercount.” Do you agree with that statement? If you 
disagree, please explain why. 

14. With regard to concerns that the Integrated Coverage Measurement process could 
be manipulated to achieve a particular outcome in terms of the population counts, 
Dr. Brown testified that, “if all of this planning is done in advance, it is very, very 
hard for me to see how one could direct these subjective decisions towards any 
desired goal.” Do you agree with Dr. Brown that if the procedures and protocols 
for the Integrated Coverage Measurement are set forth in advance and subject to 
expert and public scrutiny, that it is very unlikely that the sampling and statistical 
estimation process will be subject to manipulation, possibly for political 
advantage? If you disagree, please explain why. 

15. Dr. Brown also testified that even after the non-response follow-up phase of the 
census is complete, there “would still [be] the undercount problem of those people 
who just refuse to be counted or are very difficult to count.” Do you agree with 
that statement? If you disagree, please explain why. 

16. With regard to the post-enumeration survey in the 1990 census, Dr. Brown 
testified that many of the difficulties with the procedure “can be traced to the fact 
that the PES sample was much too small to support the kind of objective, reliable 
analyses that are desired.” Do you agree with that? If you disagree, please 
explain why. 

17. The size of the sample in the Integrated Coverage Measurement (ICM) is 750,000 
households. Is that a proper size for such an endeavor? 

18. The results of the PES in 1990 showed that census was less accurate than its 
predecessor. That result was confirmed by demographic analysis, which has been 
performed on every census since 1940. We certainly know that the 1990 census 
was much more expensive than the 1980 census. Do you agree with the 
conclusion that 1990 was also less accurate than 1980? 

19. Please explain the difference between net over- or undercount in the 1990 census 
count and actual over- and undercounts (mistakes) made in the 1990 count. I 
know that a net undercount of 1.6% sounds relatively small but for census 
purposes, aren’t those 26 million mistakes a concern? 

20. I understand that improvement in the average does not necessarily mean that there 
will be improvement in every case. In 1990, there was criticism about the strata 
being broken down by region. If statistical methods are used in 2000, with strata 
broken down by state in 2000, can we expect more states with improved accuracy 
than there were in 1990? 

21. Representative Sawyer pointed out that the longer the Census Bureau is in the 
field, the higher the error rate in the information collected. I believe that 
information came from one of the many GAO studies he and his Republican 
colleagues commissioned. You have stated your concern about the Census 
Bureau not being in the field for enough days in the 2000 plan. Can you explain the 
difference in opinion? 

22. In order to address the problem of declining public response, the GAO suggested 
exploring a radically streamlined questionnaire in future censuses. Would you 
give us your thoughts on how effective this approach might be in increasing 
response, and also its effect on perhaps diminishing the usefulness of census data? 

23. In its 1992 capping report on the 1990 census, the GAO concluded that “the 
results and experiences of the 1990 census demonstrate that the American public 
has grown too diverse and dynamic to be accurately counted solely by the 
traditional ‘headcount’ approach and that fundamental changes must be 
implemented for a successful census in 2000.” Do you agree with that 
conclusion? If you disagree, please explain why. 

24. After the 1990 census, GAO concluded that “the amount of error in the census 
increases precipitously as time and effort are extended to count the last few 
percentages of the population. ...This increase in the rate of error shows that 
extended reliance on field follow-up activities represents a losing trade-off 
between augmenting the count and adding more errors.” In the last months of the 
follow-up efforts in 1990, GAO estimated that the error rates approached 30 
percent, and that this problem was probably exacerbated by the use of close-out 
procedures. This appears to be a problem inherent to the methodology of the 
1990 census. Don’t you agree? 

Do you have any information on the error rates for information gathered using 
close-out procedures? 

Even if sampling is not perfect, isn’t its error rate well below the levels for the last 
percentages of the population using more traditional follow-up procedures? 

If this is the case, then doesn’t that logically lead to GAO’s and the Commerce 
Department’s Inspector General’s conclusion that sampling at least a portion of 
the nonresponding households would increase the accuracy and decrease the cost 
of conducting the census? 

25. GAO also concluded after the 1990 census that a high level of public cooperation 
is key to obtaining an accurate census at reasonable cost. Unfortunately the mail 
response rate has fallen with every census since 1970, and was only 
approximately 65 percent in 1990. The reasons for this decline are in many 
instances outside of the Census Bureau’s control, for example the increase in 
commercial mail and telephone solicitations and in nontraditional household 
arrangements. For these reasons, the Bureau is planning a public education 
campaign for the 2000 census, surpassing any previous attempts. Given the 
response in 1990, do you believe this is money well-spent? 

Do you believe that this public education campaign can succeed in arresting the 
decline in response rates? 

Even if it does, wouldn’t some use of sampling be warranted to solve the 
problems associated with reaching the last few percentages of nonresponding 
households? 

My questions and your answers will be part of the permanent record of the May 5, 1998, hearing. 
Again, thank you for your input into this most important process. 

Sincerely, 


Carolyn B. Maloney 
Ranking Minority Member 
Subcommittee on the Census 



cc: The Honorable Dan Miller 





Responses to Representative Maloney - 

1. Can you tell us about a statistical or scientific activity that you've worked on that either 
worked perfectly the first time you tried it, or that didn't work as well as you had hoped 
the first time so you abandoned the idea altogether without making an effort to improve or 
redesign it? 

A. I have been a hard-nosed advocate of "getting it right the first time" for many years, and won 
over many of my colleagues at OMB. This approach emphasizes planning that not only covers 
what you expect, but also is robust with respect to things that may go wrong. It works well even 
when it doesn’t work perfectly. 


2. Despite the fact that the Census Bureau made improving the count among minorities a 
major goal of the 1990 Census, the 4.4 percent differential in the 1990 undercount between 
Blacks and non-Blacks was the highest ever recorded. Experts have repeatedly said that 
spending more money on traditional methods will not reduce this differential. If not 
through statistics, how do you propose to reduce this differential? 

A. The undercount is by its nature a nonresponse problem (or in many cases a refusal problem). 
A refusal rate as low as 1% would be considered an outstanding achievement for most surveys. 
There are two factors that draw attention to this particular problem in the Census: 

a) The problem of missing data in the census data base is much larger than this, but these 
other gaps are susceptible to "imputation," which, by its nature, leaves little or no 
evidence of the alteration (imputed items, for better or worse, are designed to look like 
the actual data on which they are based). The evidence of the undercount gap is external. 
The comparative estimates may not be completely accurate, but they are highly visible. 

b) The requirement for a complete enumeration goes substantially beyond the arena 
where statistical methods are most effective (and beyond the arena where the usual 
statistical standards for an acceptable level of error can be applied). 

On this second point, comparison with the "voluntary" income tax system is instructive. IRS has 
used statistical principles in its approach to tax returns (e.g., it offers taxpayers the option of 
selecting an imputed average minimum deduction, the "standard deduction," in lieu of 
documenting actual deductions), but an explicit statistical adjustment to the tax liability of 
willing taxpayers to offset the loss due to those who refuse to file would be considered arbitrary 
and capricious. Rather we live with the consequences (reduced revenue) and IRS spends an 
extraordinary amount of research and auditing effort to discover those who refuse to pay or 
underpay their taxes and to correct these problems on a case-by-case basis. Such a statistical 
adjustment strategy could eliminate the estimated revenue shortfall and make the revenue per 
person with tax liabilities more "accurate" on the average, but it would be less accurate for 
almost every individual who actually pays his or her taxes. Most statistical methods do not deal 
well with the issues of individual fairness that are critical to administrative systems such as the 
tax system or to an "actual enumeration" intended to provide "fair" representation for every 
individual. 

Some lessons of the tax system are also useful for a census. Look for classes of individuals who 
represent a disproportionate share of the refusals (the differential undercount), and use this 
information to refine and focus your strategy for bringing those individuals into the system. 
Demographic Analysis has provided some of the most important insights into the gross 
characteristics of census refusals, but Dual System estimates of those characteristics (because of 
the large confounded bias component and the substantial inconsistency with DA results) may be 
counterproductive for such a strategy (e.g., if attributes that arise from this very large bias are 
misinterpreted as actual attributes of the refusal population, efforts and funds may be 
substantially mistargeted). 

"Traditional" efforts have missed some important opportunities to improve response in the past 
and I have commented on these in my responses to questions #8, #23, and #25 below. 


3. You have mentioned your concerns about block level accuracy. Can you discuss you 
(sic) thoughts on the accuracy of census numbers at the state level if Dual System 
Estimation is used in 2000? Do you have any evidence that suggests that the census counts 
will be more accurate at the state level in 2000 if DSE is not used? 

A. I did not comment on block-level accuracy per se. Three important attributes of an 
enumeration that must not be discarded lightly are 1) simple robustness, 2) uniform accuracy in 
both large and small areas and 3) additivity. As a practical matter, block level accuracy is one 
way these properties can be substantially preserved. I have commented on the shortcomings of 
compromise accuracy targets in my response to Miller question #1. As to the performance of 
DSE, I only note that it failed even the test of state-level accuracy in the one large scale 
evaluation we have (the 1990 PES) — this is elaborated in my response to question #4 below. 


4. Secretary Mosbacher, in testimony before both the House and the Senate, said that the 
Post Enumeration Survey would make the majority of the states more accurate. Is that 
statement correct? If so, why is his testimony so at odds with your testimony? 

A. The statement by Secretary Mosbacher in 1991 was based on the original Census Bureau 
adjustment estimate, which was later found by the Census Bureau to be substantially inaccurate. 
Indeed the Secretary noted that the tally of 29 more accurate versus 21 less accurate, based on the 
original 2.1% adjustment, had already been reversed (to about 21-23 more accurate and 27-29 
less accurate) based on the findings of independent analysts. A year later the Census Bureau 
acknowledged that the original adjustments were substantially in error, revising the overall PES 
undercount estimate downward to 1.58%. By August of 1992, the CAPE report had been 
completed and it showed that even the revised estimate overstated the PES measured undercount 
by an amount much larger than the July 1992 "correction"; in other words the revised 
undercount data set reflected in about equal parts the characteristics of the undercounted and the 
characteristics of measured bias. 
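
The "about equal parts" description can be checked with simple arithmetic. The sketch below assumes that the roughly 45% measured-bias share attributed to the CAPE report elsewhere in this hearing record applies to the revised 1.58% estimate; the variable names are illustrative only, and the result is an approximation rather than a figure taken from the report.

```python
# Revised overall PES undercount estimate cited in this response (percent).
REVISED_PES_UNDERCOUNT_PCT = 1.58

# Assumption: the "about 45%" measured-bias share quoted from the CAPE report
# elsewhere in this hearing record applies to the revised estimate.
MEASURED_BIAS_SHARE = 0.45

# Portion of the revised estimate that would be measured bias.
bias_pct = REVISED_PES_UNDERCOUNT_PCT * MEASURED_BIAS_SHARE

# Portion that would remain as measured undercount.
measured_undercount_pct = REVISED_PES_UNDERCOUNT_PCT - bias_pct

print(round(bias_pct, 2))
print(round(measured_undercount_pct, 2))
```

Under that assumption, roughly 0.7 percentage points of the revised estimate would be measured bias and roughly 0.9 would be measured undercount, which is in line with the "about equal parts" characterization.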




The 1991 statement by the Secretary reflected what the Census Bureau had told him at the time. 
Due in large part to the efforts of the Secretary in commissioning the comprehensive CAPE 
evaluation, the 1991 data set on which these 1991 claims were based turned out to be more error 
than fact. 


5. The 1990 census cost 20 percent more per household in real dollars than the 1980 
census. The 1980 census cost twice as much per household in real dollars as the 1970 
census. That is an increase in real dollar cost per household of 250 percent with no 
improvement in the differential undercount. Does that suggest to you that spending more 
on traditional methods will reduce the differential undercount? 

A. As I indicated in my response to question #2, the undercount is a nonresponse problem more 
than a design problem. While I was at OMB, I was consistently critical of Census Bureau 
arguments that revolved around the premise that "we have always done it that way." OMB 
regulations prevent me from discussing the detailed information that came to me in the course of 
my employment, but suffice it to say that we made a number of recommendations for improving 
response to the Census prior to 1990. I am gratified that the Census Bureau under Martha Riche 
adopted some of these recommendations for the 2000 census. Other recommendations for 
improving the performance of the count have never been acted on by the Census Bureau (I have 
commented on some of these in my response to question #8). Because of the high visibility of 
this undertaking the Census Bureau has been risk-averse for a long time. It has been slow to 
embrace response theories (due as much to cognitive psychologists as to statisticians) that have 
proven very effective over the past two decades. 


6. Demographic analysis showed higher undercounts of African Americans than the 
undercounts demonstrated by the Post Enumeration Survey. That suggests that the Post 
Enumeration Survey understates, not overstates, the undercount, especially for minorities. 
In other words, isn't it likely that the 1990 census missed more African-Americans than 
would have been added back into the census by the Post Enumeration Survey? 

A. The substantial inconsistencies between the picture of the undercount population implied by 
demographic analysis (DA) and that implied by the 1990 Post Enumeration Survey (DSE 
methodology) were of great concern to the expert panel supporting the CAPE evaluation. In 
comparing the two, the panel drew attention (in Attachment 8) to several sources of error in the 
DA estimates — some that would generally exaggerate the number of persons not counted and 
others that would specifically exaggerate the number of blacks and Hispanics not counted. 
While DA may be subject to these biases, they pale in comparison to documented biases in the 
PES. The DSE methodology used in the PES was able to measure undercounted persons 
accounting for, at best, about 0.9 percent of the population. 

If the undercounted population is any larger than this (as is implied by DA), then the racial and 
ethnic characteristics of the missing group are unknown to PES (in DSE there is no data on the 
missing group). This is why the expert panel pressed the Census Bureau so persistently to 
remove this large remaining bias — unless and until that large bias is removed, it is impossible to 
isolate the true racial and ethnic characteristics of the (measured) DSE undercount group from 
the spurious racial and ethnic characteristics attributable to the millions of spurious undercount 
cases inferred by DSE but actually contributed by the DSE bias processes. [Note: the previous 
sentence refers only to the net effect of bias — the errors due to DSE that produced this measured 
bias in the net figure actually contributed much larger numbers of spurious undercounts and 
spurious overcounts inferred by the DSE methodology — some of these spurious inferences may 
be related to the numbers that appear in question #19.] 

In short, DA may somewhat exaggerate the number of persons undercounted, but the DSE 
methodology measured a far smaller undercount group than that implied by DA, and could not 
estimate the racial and ethnic composition of this measured undercount group with any accuracy 
in the presence of DSE’s very large measured bias. 

If you could remove the bias from the dual system PES estimates, and you were willing to make 
the leap of faith (as we are asked to do in the 2000 Census plan) that DSE works, (i.e., its 
measured undercount group accurately represents the actual undercount), you are left with the 
unavoidable conclusion that demographic analysis substantially overstated the 1990 undercount 
(probably by exaggerating black and Hispanic components of the differential undercount) and 
that the 1990 Census was the most accurate census in history. 


7. You have talked a lot about bias in the Post Enumeration Survey but have not talked 
much about the bias in the census. The differential undercount measured by demographic 
analysis shows that bias in the census is quite real. If there is no Integrated Coverage 
Measurement, is it not the case that this bias in the census will continue? 

A. There are several types of bias in the count. They generally reflect the kind of nonsampling 
error statisticians classify as "nonresponse," and reflect a level of nonresponse that would be 
considered trivial in almost any sample survey (remember that the accuracy standards expected 
of an enumeration vastly exceed those that sample surveys are typically capable of meeting). 
Based on external benchmarks (demographic analysis), there is an overall downward bias in the 
count of about one percent. If this bias were uniform, it would make virtually no difference to 
the objective of supporting an accurate apportionment. The same external benchmark suggests 
that the bias may not be uniform across all potential Congressional districts. The evidence here 
is in proxy demographic variables, e.g., race and ethnicity. Once again, if the racial and ethnic 
characteristics of the populations in each potential Congressional district were uniform, there 
would be virtually no effect on the accuracy of apportionment. So one must explore the 
mechanisms that produce these differences. 

If the Census Bureau address listing methodology disproportionately misses black or Hispanic 
households, then this error affects both the count and the DSE. Likewise, if a disproportionate 
number of black and Hispanic households deliberately avoid participation in the census, they will 
be missed by both the count and the DSE. DSE is blind to these particular types of errors and 
impotent to "correct" them. The presence or absence of ICM has no effect on errors of this kind. 
The bias they produce will persist until the nonresponse problems are addressed directly by such 
things as better listing, more effective follow-up (refusal conversion), and eliminating root causes 
of mistrust. 

The value of coverage measurement is a different story. The DSE methodology requires data 
and is thus not very useful for dealing with true non-response, but this is not to say that a well- 
designed coverage measurement program cannot contribute significantly to improving the 
accuracy of the count. During the period when my OMB colleague, the late Maria Gonzalez, 
was reviewing plans for the 1990 Census, we discussed the role of coverage measurement at 
length. I had the temerity to suggest that, with all the expectations placed on the PES, the sample 
should be larger. Maria gently took me to task, pointing out that a sample small enough to be 
performed quickly by an expert staff could detect performance problems and errors early enough 
to correct problems in the count. (These advantages of a small manageable sample are also 
reflected in the September 1996 Report of the American Statistical Association Blue Ribbon 
Panel on Uses of Sampling in the Census.) This potential for feedback is the critical difference 
between effective quality control and simple quality measurement. 


8. Do you believe that it is acceptable for the census to consistently miss certain segments 
of the population — Africans (sic) Americans, Latinos, Asian Americans, poor people in 
rural and urban communities — at greater rates than the White population? If that is not 
acceptable, what do you propose be done to reduce the differential undercount? Can you 
offer any evidence that you (sic) proposal(s) will reduce the differential undercount? 

A. I believe most of these problems must be recognized and addressed for what they are - 
deficiencies in performance and highly motivated refusals. The Census Bureau is aware of large 
differences in response performance even among its regular staff, but has been reluctant to admit 
or address this problem administratively for a variety of reasons (possible litigation risk?). After 
the 1980 Census, focus groups were conducted in high undercount areas — the results were 
reported by GAO. This research identified some strong, perfectly rational motivations for 
resisting the Census, most of them related to the numerous questions on the long form. This is a 
dilemma for the Census Bureau. Local agencies lobby heavily to retain the questions so that they 
can use block-level data to target such programs as housing code enforcement. But people who 
have observed the housing inspectors moving into their neighborhoods after the block-level data 
are released can make the connection for themselves - and tell their neighbors about it next time. 
Even those who simply wonder why those questions are there can put two and two together. 

The attraction of the rich, geographically detailed Census research data base for enforcement 
authorities creates some perverse incentives. For anyone who may have reason to avoid the 
notice of enforcement authorities, the only safe course may be to avoid the census altogether or 
to file a false report that appears consistent with local regulations (e.g., omit listing some 
occupants to avoid evidence of overcrowding). 

Some steps have already been taken - 

a) Making more forms available for willing respondents may offset differential problems 
in the mailing list (though I still have some concerns here about double counting or other 
fictitious reports). 

b) Reducing the number of questions on the long form may help if the changes reduce 
suspicions among groups with high refusal rates. 

One other way to address this problem is to decouple the research component (the long form) of 
the Census from the count (the short form). The temptation to piggy-back this huge sample 
(about 20 million) on the basic count has outweighed the possibility of reducing the undercount. 
The Continuous Measurement program had the potential for accomplishing this, but this has 
been deferred, trading off potential reductions in undercount for 2000 for one more bite at a 
huge long form sample. 

Yet another possibility that has been proposed is to reduce the size of the long-form sample. 
Ironically, the long form sample size claimed to be the "minimum" needed to gather research 
information adequate to make program decisions is about 25 times larger than the sample size 
proposed to "correct" the constitutionally required count. I suspect these sampling judgments 
have more to say about the priorities of the research community and various bureaucracies than 
any reasonable statistical calculation. If the sample size of the research component is reduced 
across the board, then some resources are freed up for more extensive follow-up of the count. If 
the research community will not stand for this, even a reduction of the long form sampling rate in 
high undercount areas would be useful. There are also sound technical reasons for reducing the 
sampling rate in areas of high population density (what determines sampling error is sample size, 
but sampling rates are easier to sell to nonstatisticians — the resulting distortion has affected the 
sample designs for both the long form and coverage measurement). 
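
The distinction drawn above between sample size and sampling rate can be illustrated with a short calculation. What follows is only a sketch under stated assumptions: simple random sampling without replacement, a hypothetical proportion of 0.5, a hypothetical uniform rate of one in six, and two made-up area sizes. None of these figures come from the testimony or from Census Bureau designs.

```python
import math


def standard_error(p, n, N):
    """Standard error of a sample proportion under simple random sampling
    without replacement, including the finite population correction."""
    fpc = (N - n) / (N - 1)
    return math.sqrt(p * (1 - p) / n * fpc)


def sample_size(N, rate):
    """Number of sampled units when a fixed rate is applied to N units."""
    return round(N * rate)


# Hypothetical inputs (assumptions for illustration only).
P = 0.5                 # proportion being estimated
SMALL_AREA = 600        # households in a sparsely populated area
DENSE_AREA = 60_000     # households in a densely populated area

# Same sampling rate in both areas: very different sample sizes.
n_small = sample_size(SMALL_AREA, 1 / 6)
n_dense = sample_size(DENSE_AREA, 1 / 6)
se_small = standard_error(P, n_small, SMALL_AREA)
se_dense = standard_error(P, n_dense, DENSE_AREA)

# Cut the rate in the dense area to one tenth of the uniform rate.
n_dense_reduced = sample_size(DENSE_AREA, 1 / 60)
se_dense_reduced = standard_error(P, n_dense_reduced, DENSE_AREA)

print(f"small area, rate 1/6:  n={n_small}, SE={se_small:.4f}")
print(f"dense area, rate 1/6:  n={n_dense}, SE={se_dense:.4f}")
print(f"dense area, rate 1/60: n={n_dense_reduced}, SE={se_dense_reduced:.4f}")
```

Under these assumptions the dense area still has a smaller standard error at one tenth of the rate than the small area has at the full rate, because its sample remains larger in absolute terms.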


9. It has been stated that one of the faults of the 1990 PES was correlation bias. Can you 
explain correlation bias? I understand that it is the likelihood that the people missed in the 
census may be the same people missed in the PES. Said another way, both the census and 
the survey miss the same people, for example, young Black males. How does correlation 
bias affect the accuracy count of those traditionally undercounted, Blacks, Hispanics, 
Asians, Native Americans, renters? 

A. There are two reasons for using the term "correlation bias." The first has to do with the fact 
that the two samples being compared are not independent. Census argues that this effect is small. 
Another reason is that cases missing from either the count and/or the PES are correlated. 
However for the critical "4th cell" (the unobserved cases missing from both), your interpretation 
is correct — some unknown number of missed cases (for which there are no data whatever) are 
assumed to make up this cell. But since there are no data, there are no attributes (black, 
Hispanic, young, male, etc.) to measure. Correlation bias is a property of the DSE methodology, 
not the underlying count. It represents that hypothetical portion of the undercount for which 
neither the count nor the follow-up survey have produced any useful information. Even the size 
of this bias can only be inferred indirectly by reference to other information. In other words it 
represents a (hopefully small) chunk of ignorance which adds nothing to our knowledge of the 
undercount. 
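
The "4th cell" problem can be sketched with the textbook form of the dual system (Lincoln-Petersen) estimator. This is an illustration only, not the Census Bureau's production procedure: the two population groups, their sizes, and their capture probabilities are hypothetical, and the function names are invented for the example. Within each group the count and the survey are treated as independent, but because one group is harder to reach in both operations, the pooled data are correlated.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Textbook dual system (Lincoln-Petersen) estimate of the total population."""
    return census_count * survey_count / matched


def implied_fourth_cell(census_count, survey_count, matched):
    """Number of people the estimator assumes were missed by both operations."""
    return (census_count - matched) * (survey_count - matched) / matched


# Hypothetical groups: (true size, probability of capture in each operation).
groups = [
    (900, 0.95),  # easy-to-count group
    (100, 0.50),  # hard-to-count group, missed more often by BOTH operations
]

true_total = sum(size for size, _ in groups)

# Expected counts, assuming independence within each group.
census_count = sum(size * p for size, p in groups)
survey_count = sum(size * p for size, p in groups)
matched = sum(size * p * p for size, p in groups)
true_fourth_cell = sum(size * (1 - p) ** 2 for size, p in groups)

estimate = dual_system_estimate(census_count, survey_count, matched)
estimated_fourth_cell = implied_fourth_cell(census_count, survey_count, matched)

print(f"true total:            {true_total}")
print(f"dual system estimate:  {estimate:.1f}")
print(f"true 4th cell:         {true_fourth_cell:.2f}")
print(f"estimated 4th cell:    {estimated_fourth_cell:.2f}")
```

In this made-up case the estimator recovers only part of the people missed by both operations, and it does so without any attribute data on them. The shortfall is the correlation bias described above.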

Its effect on the accuracy of estimated characteristics of the undercounted population must also 
be inferred indirectly. The CAPE report breaks down the DSE (revised) estimate of 1.58 percent 
undercount as 0.85% actual measured undercount and 0.73% measured bias. Note that the 
correlation bias has not yet appeared. The 0.73% bias reflects the measurable errors made in the 
DSE analysis, e.g., correct counts incorrectly classified as overcounts or undercounts. If the 
characteristics of these bias cases could be removed (as the expert panel urged) then what would 
be left would be a set of accurate characteristics (race, ethnicity, etc.) of the actual undercount 
group found by DSE. But the Census Bureau determined that the processes available to remove 
the effect of the bias would add additional error. So the only characteristics of the 1.58% DSE 
estimate of undercount that can be tabulated consist of the unknown real characteristics of the 
true undercount group mixed with the unknown spurious characteristics contributed by the bias 
group. Since the detailed characteristics of the undercount group in DSE are knowable but are, 
in fact, unknown, any adjustment for the offsetting "correlation bias" (whose characteristics are 
unknowable within DSE by definition) can only be made at the most aggregate level. Based on 
other information, the Census Bureau estimated the size of the correlation bias (relative to the 
1.58% estimate) at 0.38%, leaving an overall estimate of about 1.2% undercount consisting of 
0.85% actual measured undercount within DSE (net of measured bias) plus 0.38% undercount 
missed by DSE. About the only thing we do know after all this is that the results of DSE are 
clearly inconsistent with the 1.8% undercount estimate derived from demographic analysis. 

On another level, however, we can indirectly infer something about the accuracy of undercount 
characteristics that are theoretically observable in DSE. If we could remove the measured bias, 
and if the correlation bias were large enough to account for the difference between the 0.85% 
measured undercount observed in DSE and the 1.8% total undercount implied by DA, we would 
be left in the position of trying to estimate the characteristics of the whole undercount population 
(1.8%) from DSE information that represents only a minor fragment (0.85%) of that population. 
This is why I asserted that the DSE methodology becomes unreliable if the correlation bias is 
large. 
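
The arithmetic in this response can be laid out in a few lines. The sketch below uses only the percentages quoted above. The variable names are invented for the illustration, and the comparison at the end is a simple ratio, not a figure taken from the CAPE report.

```python
# Figures quoted in the response above, as percent of the population.
dse_revised_estimate = 1.58   # revised DSE undercount estimate
measured_undercount = 0.85    # actual measured undercount within DSE
measured_bias = 0.73          # measured bias within the DSE estimate
correlation_bias = 0.38       # undercount missed by DSE (Bureau's estimate)
da_undercount = 1.8           # undercount implied by demographic analysis

# The DSE estimate splits into measured undercount plus measured bias.
bias_residual = dse_revised_estimate - measured_undercount

# Removing the measured bias and adding the correlation bias gives "about 1.2%".
adjusted_estimate = measured_undercount + correlation_bias

# If DA were right, how much of the undercount would DSE have observed?
gap_to_da = da_undercount - measured_undercount
observed_share_of_da = measured_undercount / da_undercount

print(f"bias residual:        {bias_residual:.2f}")
print(f"adjusted estimate:    {adjusted_estimate:.2f}")
print(f"gap to DA:            {gap_to_da:.2f}")
print(f"observed share of DA: {observed_share_of_da:.0%}")
```

On these figures the measured undercount is roughly 47 percent of the DA figure, so more than half of a DA-sized undercount would have to sit in the cell for which DSE has no data.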


10. Wouldn't the only risk of correlation bias be minimization of the undercount rather 
than an overestimation (sic) the undercount? 

A. No. The risk if the correlation bias is large is that you have no information about most of the 
undercount population. Statistical estimates based on no information are notoriously unreliable. 


11. In testimony before the Senate Committee on Governmental Affairs approximately one 
year ago, Dr. Lawrence Brown, Professor of Statistics at the University of Pennsylvania, 
stated that, "Statistical sampling methods can be used in an effective and objective way to 
assist the census process." Do you agree with Dr. Brown's statement? If you disagree, 
please explain why. 

A. Yes, I agree. Sampling methods have been used in this way for over 50 years. 


12. Dr. Brown also testified before Senator Thompson that the Sampling for Nonresponse 
Follow-up plan "is an objective procedure all the way around (and) has a very good chance 
of working as desired." Do you agree with that statement? If you disagree, please explain 
why. 

A. The statement is too speculative for my tastes and, as Dr. Brown noted, is based in part "on 
idealized statistical assumptions." In fairness to Dr. Brown, he also noted risk of bias, several 
real-world problems, and some potentially troublesome interactions, concluding on balance that 
"if Congress can find the money, I’d prefer to see a full follow-up rather than the current sample 
response follow-up plan." I fully agree with that conclusion. 


13. In addition, Dr. Brown testified that the Census Bureau's 2000 census plan had been 
"drastically simplified and improved. ...(these changes) make it possible to now believe that 
the Integrated Coverage Measurement might work as well as desired to correct the 
undercount." Do you agree with that statement? If you disagree, please explain why. 

A. I believe the statement with ellipsis and insertion(s) does not accurately convey the view 
expressed by Dr. Brown. There were two slightly different statements, one in his prepared 
testimony and the other in the transcript. The clearer of the two was in the transcript: 

"As of a month ago, the plans for the first stage were drastically simplified and improved, 
I believe. And these first stage changes lead me to believe that that stage can work to 
provide suitably accurate numbers. And some other changes they have announced to the 
ICM protocol make it possible to believe that that it might work as well as desired." 

[note — the written statement also included the words "to correct undercount" at the end 
of this sentence] 

The first part of the quotation is an endorsement of the improvements made to the first stage 
(sampling for follow-up) — I agree that improvements have been made. The second part of the 
quotation is an extremely guarded expression of optimism that ICM may work after all. I am 
much less sanguine than Dr. Brown. A lot depends on what is meant by suitable accuracy or 
working "as well as desired." My views are based, not on the highly charged debate in the 
literature, but on the comprehensive evaluation (CAPE) performed by the Census Bureau itself. 


14. With regard to concerns that the Integrated Coverage Measurement process could be 
manipulated to achieve a particular outcome in terms of the population counts, Dr. Brown 
testified that, "if all of this planning is done in advance, it is very, very hard for me to see 
how one could direct these subjective decisions towards any desired goal." Do you agree 
with Dr. Brown that if the procedures and protocols for the Integrated Coverage 
Measurement are set forth in advance and subject to expert and public scrutiny, that it is 
very unlikely that the sampling and statistical estimation process will be subject to 
manipulation, possibly for political advantage? If you disagree, please explain why. 

A. Variations on this approach have always held some attraction for OMB, particularly when 
there was potential for subtle abuses. The results have been mixed. Public or even expert 
scrutiny may reach consensus long before it has reached the whole truth. Constraints on 
processes tend to reward conservative methods and inhibit innovative breakthroughs. But the 
most potent risks arise in the case of procedures that are not inherently robust. All the careful 
caveats devised by statisticians cannot prevent a motivated advocate from changing the results by 
altering sensitive assumptions. Some statisticians believe that they have discharged their 
professional obligations by adding caveats to a frail result. I do not share that view. 

I applaud the four minimum principles for effectiveness and objectivity advanced by Dr. Brown 
in paragraph 2 of his prepared testimony, and I would add robustness to the list for the reasons 
indicated above. I also share the concern expressed in his notes about ICM procedures: 

"such a procedure violates my principle 2(iii). But this contradiction at present seems 
unavoidable if one hopes to use reasonable ICM procedures to reduce the differential 
undercount problem below where it stood in 1990 and 1980.” [emphasis added] 


15. Dr. Brown also testified that even after the non-response follow-up phase of the census 
is complete, there "would still (be) the undercount problem of those people who just refuse 
to be counted or are very difficult to count." Do you agree with that statement? If you 
disagree, please explain why. 

A. I agree that there will always be some residual refusals and performance errors, but my 
response to question #8 above indicated the potential for reducing these below the currently 
accepted levels. I would certainly not write off the current level of refusals as unavoidable if that 
is what is being inferred from Dr. Brown’s statement. 


16. With regard to the post-enumeration survey in the 1990 census, Dr. Brown testified 
that many of the difficulties with the procedure "can be traced to the fact that the PES 
sample was much too small to support the kind of objective, reliable analyses that are 
desired." Do you agree with that? If you disagree, please explain why. 

A. From the content and context of his statement, I believe Dr. Brown was referring to the 
shortcomings of the PES as an analytical tool. There are also indications in the CAPE report that 
additional data might have resolved some of the intractable problems of the evaluation, for 
example the inability to remove the measured bias. On the other hand, increasing sample size 
does not generally reduce the size of such biases, and it was the size and relationship of the 
biases that was the downfall of the PES as a tool for accurately allocating the undercount. 




17. The size of the sample in the Integrated Coverage Measurement (ICM) is 750,000 
households. Is that a proper size for such an endeavor? 

A. It is much too large and unmanageable to provide quality control or even to secure the 
advantages touted by the ASA Blue Ribbon Panel (e.g., tighter control using expert staff to 
reduce nonsampling error). The Census Bureau has experienced differential performance 
problems using regular staff in samples as small as 20,000 households. And if the DSE 
methodology fails because the type of nonsampling error known as correlation bias is too large (a 
distinct possibility based on the PES experience), then any sample size is too large. 


18. The results of the PES in 1990 showed that the census was less accurate than its 
predecessor. That result was confirmed by demographic analysis, which has been 
performed on every census since 1940. We certainly know that the 1990 census was much 
more expensive than the 1980 census. Do you agree with the conclusion that 1990 was less 
(sic) also less accurate than 1980? 

A. As indicated in my response to questions #6 and #9, the PES results were in substantial 
conflict with 1990 demographic analysis results. If you believe the DSE methodology worked, 
then 1990 looks more accurate than 1980. I tend to agree with Dr. Brown — it was pretty much a 
wash. I can still remember the wide-spread consternation with the unprecedented "surprises" and 
errors that occurred in the 1980 Census (the largest "closure error" in history by a wide margin, 
compromised quality control, and many horror stories from the field). This large deviation from 
expectations (closure error) probably made it much more difficult to detect and correct other, 
smaller discrepancies. So I would not be surprised if the performance indicators were a bit soft 
in 1980. Demographic analysis has provided the most consistent benchmark, but its methods and 
assumptions have changed over time and I doubt that comparisons are reliable to tenths of a 
percent. In 1990, there were some suspicions that the DA figures might be too high. You also 
have to make allowances for the fact that the most recent census is almost always bad-mouthed 
in the course of justifying more funds for the next one. 


19. Please explain the difference between net over- or undercount in the 1990 census count 
and actual over- and undercounts (mistakes) made in the 1990 count. I know that a net 
undercount of 1.6% sounds relatively small but for census purposes aren't those 26 million 
mistakes a concern? 

A. I have some difficulty understanding this question, but I will try to respond. First, 26 million 
would be about 10% of the count, but I am not sure what that figure refers to. Undercount, 
overcount, and the "net" are less haphazard and less precise than might be inferred from the 
details of this question. No count is perfect, so it is assumed that there are some undetected 
double counts and undetected undercounts that are reflected in the total enumeration. Since they 
are undetected, we don’t know how many there are from the count itself. By means of external 
comparisons (principally demographic analysis), we can estimate (with some error) how far off 
the count may be. Since the external comparison doesn’t tell us anything about the mix of over- 
and undercount, we can logically infer things only about the net effect. In the past, coverage 
evaluation has provided some incomplete estimates of the mix of over- and undercount, but this 
should not be confused with the 1990 output of DSE (about half of those net errors were DSE 
errors, which is why they were classified as "measured bias"). (See also the response to question 
#6 above.) 

I have always been chagrined that millions of people may not take the census seriously or refuse 
to participate. OMB gets some misdirected census returns that are really bizarre. OMB also gets 
both complaints and misdirected hate mail that display distrust of the Census or the Census 
Bureau or both. 


20. I understand that improvement in the average does not necessarily mean that there will 
be improvement in every case. In 1990, there was criticism about the strata being broken 
down by region. If statistical methods are used in 2000, with strata broken down by state 
in 2000, can we expect more states with improved accuracy than there were in 1990? 

A. Finer geographic stratification is a two-edged sword. In theory, criteria other than political 
boundaries should be the deciding factors — strata that are geographically diffuse can be perfectly 
valid and may perform better. Consider a case where census staff in some states are much more 
proficient at converting refusals than staffs in other states. With geographically diffuse strata, the 
effect tends to average out, but with state-based strata, the effect produces another kind of 
differential undercount that directly distorts apportionment information. It is entirely possible 
that state-based strata are a political palliative that imposes real penalties on accuracy. 


21. Representative Sawyer pointed out that the longer the Census Bureau is in the field, 
the higher the error rate in the information collected. I believe that information came from 
one of the many GAO studies he and his Republican colleagues commissioned. You have 
stated your concern about the Census Bureau not be (sic) in the field for enough days in the 
2000 plan. Can you explain the difference in opinion? 

A. The phenomenon described by Representative Sawyer is not a matter of opinion. This 
pattern is well known — it is not unique to the Census enumeration, it occurs in sample surveys 
as well (and for the same reasons). As the achieved response rate rises, productivity tends to fall 
and error rates tend to rise. Early respondents are self motivated to cooperate. Reluctant or 
forgetful respondents tend to pay less attention to the task and thus make more errors. Resistant 
respondents reached late in the process are often distracted by irritation and have less motivation 
to consider questions carefully. The stress of dealing with respondent irritation or the pressure of 
a final close-out process may also cause data collectors to make additional errors of their own. 

But while the error rate (per observation) goes up, total error is generally reduced by filling in 
gaps that contribute to non-response error with 70% or 80% or 90% accurate information. 
Depending on error characteristics, there may be a break-even point at some very high response 
rate, but this point is rarely reached in most sample survey designs because follow-up is 
terminated at a lower level due to cost considerations. Surveys to capture very rare attributes or 
those which are bias-sensitive may spend the extra money to reach the break-even point. The 
census enumeration probably qualifies on both counts. There is a world of difference between 
flirting with the break-even point (as may be happening in some areas in the full enumeration) 
and calling it quits after a couple of tries. 


22. In order to address the problem of declining public response, the GAO suggested 
exploring a radically streamlined questionnaire in future censuses. Would you give us your 
thoughts on how effective this approach might be in increasing response, and also its effect 
on perhaps diminishing the usefulness of census data? 

A. The GAO suggestion is sound. OMB has made similar recommendations. Though many 
other factors influence respondent cooperation, questionnaire length is one of a handful of factors 
that has consistently shown a correlation with response rates. Other attributes of a "streamlined" 
approach (visual simplicity, user-friendliness) also have a salutary effect on response. If this is 
incorporated into one of the "decoupling" strategies described in my response to question #8, 
there are opportunities for a much more sophisticated research program (e.g., more frequent 
measurement, much more powerful and efficient sample designs, etc.) which can make the data- 
rich research component more useful as well. 


23. In its 1992 capping report on the 1990 census, the GAO concluded that "the results and 
experiences of the 1990 census demonstrate that the American public has grown too diverse 
and dynamic to be accurately counted solely by the tradition (sic) 'headcount' approach 
and that fundamental changes must be implemented for a successful census in 2000." Do 
you agree with that conclusion? If you disagree, please explain why. 

A. I would go farther. I would not simply supplement the traditional headcount approach, I 
would replace most of it with a modern headcount approach. Until recently, only a few 
innovators in the Census Bureau paid much attention to the extraordinary improvements that 
have been made in mail survey methods or the lessons of cognitive psychology. But there are 
some cracks in the traditional conservative edifice. In 1990, the Census Bureau was persuaded to 
use a stratified design for the long form sample (only 5 decades after Neyman demonstrated the 
power of this technique). The Bureau has been listening seriously to some of the architects of the 
methodological advances of the 1970's and 1980's. Modern, simpler form designs that tested 
very well but were rejected in the 1980's may have made a comeback for 2000. 

But the basic Census 2000 plan still represents 1980's thinking. Fundamental changes such as 
decoupling strategies that would liberate both the enumeration and the research component of the 
traditional census approach have been deferred because of risk-averse client groups. I agree with 
GAO — the time for these fundamental changes is now. 


24. After the 1990 census, GAO concluded that "the amount of error in the census 
increases precipitously as time and effort are extended to count the last few percentages of 
the population. ...This increase in the rate of error shows that extended reliance on field 
follow-up activities represents a losing trade-off between augmenting the count and adding 
more errors.” In the last months of the follow-up efforts in 1990, GAO estimated that the 
error rates approached 30 percent, and that this problem was probably exacerbated by the 
use of close-out procedures. This appears to be a problem inherent to the methodology of 
the 1990 census. Don't you agree? 

Do you have any information on the error rates for information gathered using close-out 
procedures? 

Even if sampling is not perfect, isn't its error rate well below the levels for the last 
percentages of the population using more traditional follow-up procedures? 

If this is the case, then doesn't that logically lead to GAO's and the Commerce 
Department's Inspector General's conclusion that sampling at least a portion of the 
nonresponding households would increase the accuracy and decrease the cost of conducting 
the census? 


A. I believe that GAO is discussing the same phenomenon discussed in my response to question 
#21, i.e., the increase in the per-observation error rate when pursuing high response. There 
would only be 3% total error at stake in the decision whether to pursue the last 3% of the count, 
so the 30% figure cited by GAO must be the error rate per observation. 

If cost considerations are set aside, then the (quality) break-even point is higher than the trade-off 
point implied by the GAO statement. A per-observation error rate of 30% is clearly 
preferable, from a quality standpoint, to the 100% per-observation error rate of completely 
missing an observation (some accurate data is usually better than no data). But error phenomena 
are not so well behaved as this, so there usually is a (quality) break-even point short of 100% 
response. 
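
The comparison between a 30% per-observation error rate and a missed observation can be worked through with the figures cited in this response. The sketch below is deliberately simple and rests on labeled assumptions: every erroneous observation is treated as entirely wrong, every missed observation counts as 100% error, and the function name is invented for the illustration. As noted above, real error phenomena are less well behaved than this.

```python
def error_from_last_share(last_share, followup_error_rate=None):
    """Total error contributed by the last `last_share` of the population.

    If `followup_error_rate` is None the cases are not pursued, so every one
    of them is missing and counts as 100% error. Otherwise they are collected
    at the given per-observation error rate.
    """
    rate = 1.0 if followup_error_rate is None else followup_error_rate
    return last_share * rate


LAST_SHARE = 0.03        # the last 3% of the count, as cited above
FOLLOWUP_ERROR = 0.30    # GAO's per-observation error rate, as cited above

not_pursued = error_from_last_share(LAST_SHARE)
pursued = error_from_last_share(LAST_SHARE, FOLLOWUP_ERROR)

print(f"total error if the last 3% is not pursued: {not_pursued:.3%}")
print(f"total error if pursued at 30% error rate:  {pursued:.3%}")
```

Under this accounting, pursuing the last 3% lowers total error even though the per-observation error rate for those cases is high.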

How is this affected by sampling? If you select a large sample and then pursue a 100% response 
rate among those selected for the sample, you will see the same rise in error rates as before. And 
then the small component due to sampling error must be added to this error. Sampling does not 
reduce the total error produced by pursuing high response rates, it increases it. 


25. GAO also concluded after the 1990 census that a high level of public cooperation is key 
to obtaining an accurate census at reasonable cost. Unfortunately the mail response rate 
has fallen with every census since 1970, and was only approximately 65 percent in 1990. 
The reasons for this decline are in many instances outside of the Census Bureaus (sic) 
control, for example the increase in commercial mail and telephone solicitations and in 
nontraditional household arrangements. For these reasons, the Bureau is planning a 
public education campaign for the 2000 census, surpassing any previous attempts. Given
the response in 1990, do you believe this is money well-spent?

Do you believe that this public education campaign can succeed in arresting the decline in 
response rates? 

Even if it does, wouldn't some use of sampling be warranted to solve the problems 
associated with reaching the last few percentages of nonresponding households? 

A. Some of the trend in mail response rates is due to the fact that the mail portion of each 
successive census covered a larger fraction of the population. In the earlier censuses, the target 
population for mail was more selective and easier to reach successfully. By 1990 the mail
portion was virtually the whole census and none of the problems could be avoided. But now 
that 100% has been reached, this element of the decline should plateau. Response to telephone 
surveys has been hit hard by telemarketing and call-screening technologies. Response to 
traditional mail survey methods (like the census) has also declined, but more modern mail survey
methodologies have bucked the trend (see my response to question #23). 

One of the most effective elements in the modern revival of mail methodologies is the multiple
contact strategy. These are almost always personalized contacts, but a "public education" 
campaign that drew attention and raised interest in the census might produce some of the same 
effect. If the campaign is as bureaucratic and condescending as its title, it probably won't produce
that effect. 

Modernization of mail methods is the best bet for reversing the "decline" experienced by 
traditional mail methods. Don Dillman ("Total Design Method") argues that there is a synergy 
among the various elements of his method that cannot be achieved piecemeal. Before he started 
updating his book, he was making fairly regular visits to Suitland. I hope the right people were 
listening. 




5. The 1990 census cost 20 percent more per household in real dollars than the 1980
census. The 1980 census cost twice as much per household in real dollars as the
1970 census. That is an increase in real dollar cost per household of 250 percent
with no improvement in the differential undercount. Does that suggest to you that
spending more on traditional methods will reduce the differential undercount?

6. Demographic analysis showed higher undercounts of African Americans than the 
undercounts demonstrated by the Post Enumeration Survey. That suggests that 
the Post Enumeration Survey understates, not overstates, the undercount, 
especially for minorities. In other words, isn’t it likely that the 1990 census 
missed more African-Americans that would have been added back into the census 
by the Post Enumeration Survey? 

7. You have talked a lot about bias in the Post Enumeration Survey but have not 
talked much about the bias in the census. The differential undercount measured 
by demographic analysis shows that bias in the census is quite real. If there is no 
Integrated Coverage Measurement, is it not the case that this bias in the census 
will continue? 

8. Do you believe that it is acceptable for the census to consistently miss certain 
segments of the population - Africans Americans, Latinos, Asian Americans, 
poor people in rural and urban communities — at greater rates than the White 
population? If that is not acceptable, what do you propose be done to reduce the 
differential undercount? Can you offer any evidence that you proposal(s) will 
reduce the differential undercount? 

9. It has been stated that one of the faults of the 1990 PES was correlation bias. Can
you explain correlation bias? I understand that it is the likelihood that the people 
missed in the census may be the same people missed in the PES. Said another 
way, both the census and the survey miss the same people, for example, young 
Black males. How does correlation bias affect the accuracy count of those 
traditionally undercounted, Blacks, Hispanics, Asians, Native Americans, renters?

10. Wouldn’t the only risk of correlation bias be minimization of the undercount
rather than an overestimation the undercount? 

11. In testimony before the Senate Committee on Governmental Affairs 
approximately one year ago, Dr. Lawrence Brown, Professor of Statistics at the 
University of Pennsylvania, stated that, “Statistical sampling methods can be used 
in an effective and objective way to assist the census process.” Do you agree with 
Dr. Brown’s statement? If you disagree, please explain why. 

12. Dr. Lawrence Brown also testified before Senator Thompson that the Sampling 
for Nonresponse Follow-up plan “is an objective procedure all the way around
[and] has a very good chance of working as desired.” Do you agree with that
statement? If you disagree, please explain why. 

13. In addition, Dr. Brown testified that the Census Bureau’s 2000 census plan had
been “drastically simplified and improved. ...[these changes] make it possible to 
now believe that the Integrated Coverage Measurement might work as well as 
desired to correct the undercount." Do you agree with that statement? If you 
disagree, please explain why. 

14. With regard to concerns that the Integrated Coverage Measurement process could 
be manipulated to achieve a particular outcome in terms of the population counts, 
Dr. Brown testified that, “if all of this planning is done in advance, it is very, very 
hard for me to see how one could direct these subjective decisions towards any 
desired goal.” Do you agree with Dr. Brown that if the procedures and protocols 
for the Integrated Coverage Measurement are set forth in advance and subject to 
expert and public scrutiny, that it is very unlikely that the sampling and statistical 
estimation process will be subject to manipulation, possibly for political 
advantage? If you disagree, please explain why. 

15. Dr. Brown also testified that even after the non-response follow-up phase of the 
census is complete, there “would still [be] the undercount problem of those people 
who just refuse to be counted or are very difficult to count.” Do you agree with 
that statement? If you disagree, please explain why. 

16. With regard to the post-enumeration survey in the 1990 census, Dr. Brown
testified that many of the difficulties with the procedure “can be traced to the fact 
that the PES sample was much too small to support the kind of objective, reliable 
analyses that are desired.” Do you agree with that? If you disagree, please 
explain why. 

17. The size of the sample in the Integrated Coverage Management (ICM) is 750,000
households. Is that a proper size for such an endeavor?

18. The results of the PES in 1990 showed that census was less accurate than its
predecessor. That result was confirmed by demographic analysis, which has been 
performed on every census since 1940. We certainly know that the 1990 census 
was much more expensive than the 1980 census. Do you agree with the 
conclusion that 1990 was less also less accurate than 1980? 

19. Please explain the difference between net over- or undercount in the 1990 census
count and actual over- and undercounts (mistakes) made int he 1990 count. I
know that a net undercount of 1.6% sounds relatively small but for census 
purposes, aren’t those 26 million mistakes a concern? 





20. I understand that improvement in the average does not necessarily mean that there 
will be improvement in every case. In 1990, there was criticism about the strata 
being broken down by region. If statistical methods are used in 2000, with strata 
broken down by state in 2000, can we expect more states with improved accuracy 
than there were in 1990? 

21. Representative Sawyer pointed out that the longer the Census Bureau is in the
field, the higher the error rate in the information collected. I believe that 
information came from one of the many GAO studies he and his Republican 
colleagues commissioned. You have stated your concern about the Census 
Bureau not be in the field for enough days in the 2000 plan. Can you explain the 
difference in opinion? 

22. In order to address the problem of declining public response, the GAO suggested 
exploring a radically streamlined questionnaire in future censuses. Would you 
give us your thoughts on how effective this approach might be in increasing 
response, and also its effect on perhaps diminishing the usefulness of census data? 

23. In its 1992 capping report on the 1990 census, the GAO concluded that “the 
results and experiences of the 1990 census demonstrate that the American public 
has grown too diverse and dynamic to be accurately counted solely by the 
tradition ‘headcount’ approach and that fundamental changes must be 
implemented for a successful census in 2000.” Do you agree with that 
conclusion? If you disagree, please explain why. 

24. After the 1990 census, GAO concluded that “the amount of error in the census 
increases precipitously as time and effort are extended to count the last few 
percentages of the population. ... This increase in the rate of error shows that
extended reliance on field follow-up activities represents a losing trade-off 
between augmenting the count and adding more errors.” In the last months of the 
follow-up efforts in 1990, GAO estimated that the error rates approached 30 
percent, and that this problem was probably exacerbated by the use of close-out 
procedures. This appears to be a problem inherent to the methodology of the 
1990 census. Don’t you agree? 

Do you have any information on the error rates for information gathered using 
close-out procedures? 

Even if sampling is not perfect, isn’t its error rate well below the levels for the last 
percentages of the population using more traditional follow-up procedures? 

If this is the case, then doesn’t that logically lead to GAO’s and the Commerce 
Department’s Inspector General’s conclusion that sampling at least a portion of 
the nonresponding households would increase the accuracy and decrease the cost 
of conducting the census? 





25. GAO also concluded after the 1990 census that a high level of public cooperation 
is key to obtaining an accurate census at reasonable cost. Unfortunately the mail 
response rate has fallen with every census since 1970, and was only 
approximately 65 percent in 1990. The reasons for this decline are in many 
instances outside of the Census Bureaus control, for example the increase in 
commercial mail and telephone solicitations and in nontraditional household 
arrangements. For these reasons, the Bureau is planning a public education 
campaign for the 2000 census, surpassing any previous attempts. Given the 
response in 1990, do you believe this is money well-spent? 

Do you believe that this public education campaign can succeed in arresting the 
decline in response rates? 

Even if it does, wouldn’t some use of sampling be warranted to solve the 
problems associated with reaching the last few percentages of nonresponding 
households? 

My questions and your answers will be part of the permanent record of the May 5, 1998, hearing. 

Again, thank you for your impute into this most important process. 

Sincerely, 

Carolyn B. Maloney
Ranking Minority Member 
Subcommittee on the Census 


cc: The Honorable Dan Miller 





Philip B. Stark, Ph.D. 


1157 Cragmont Ave 
Berkeley, CA 94708-1641 


Phone: 510-54(Mr703 

FAX: 510-486*1157 

email- stark@stat.berkeley.edu 


26 June 1998 

The Honorable Carolyn B. Maloney 

Ranking Minority Member 

Subcommittee on the Census 

Committee on Government Reform and Oversight 

2157 Rayburn House Office Building 

Washington, DC 20515-6143 

Thank you for your questions of 13 May 1998. I shall answer them by number. 

1) Can you tell us about a statistical or scientific activity that you’ve worked on that either
worked perfectly the first time you tried it, or that didn’t work as well as you had hoped the
first time so you abandoned the idea altogether without making an effort to improve it?

It has happened on several occasions that I had a conjecture I hoped was true, tried to
prove it, found a counterexample, and immediately abandoned it. It has also happened 
several times that the first approach to a problem I tried worked perfectly. Sometimes a 
technique “almost” works, and I try to improve it. The sampling-based (DSE) approach to
adjusting the census did not “almost” work in 1990. The problems with the DSE are not 
minor details that can be repaired by increasing the sample size or other incremental 
refinements: the experience from 1990 suggests that the approach is unworkable, because 
its biases are so large. The biases come from failures of the assumptions on which the 
method is based, and from insurmountable practical problems in implementing the 
approach on such a large scale. The situation is analogous to finding a counterexample to 
a conjecture. Science progresses by finding counterexamples and publishing them, so 
that others can pursue more promising approaches. The experience in 1990 seems to be a 
counterexample to the hypothesis that DSE can be used to improve the accuracy of the 
census. 

2) Despite the fact that the Census Bureau made improving the count among minorities a 
major goal of the 1990 Census, the 4.4 percent differential in the 1990 undercount between 
Blacks and non-Blacks was the highest ever recorded. Experts have repeatedly said that 
spending more money on traditional methods will not reduce this differential. If not 
through statistics, how do you propose to reduce this differential? 

First of all, the 4.4 percent figure you quote is not a fact— it is an estimate, and I am 
unsure of its source. I believe it to be based on demographic analysis, which has 
uncertainty of its own. The true undercount differential is unknown. Regardless, every
set of data has some limit on its accuracy. The 1990 sampling-based adjustments really
seem to make the accuracy worse, not better. The primary problem with the census is 
non-response. The single best thing that could be done to improve census accuracy and
decrease its cost is to motivate the public, especially undercounted groups, to fill out and 
return their census forms in a timely way. This is an area in which elected public leaders 
can make a big contribution. 

If the question were “we can afford to spend x dollars on the census— how can we get the 
highest accuracy at that cost?,” the answer might involve sampling, at least sampling for 
non-response follow-up. However, the results would probably be less accurate than a full 
head count. 

3) You have mentioned your concerns about block level accuracy. Can you discuss [your]
thoughts on the accuracy of census numbers at the state level if Dual System Estimation is 
used in 2000? Do you have any evidence that suggests that the census counts will be more 
accurate at the state level in 2000 if DSE is not used? 

My testimony concerned state-level accuracy, not block-level accuracy. The evidence that 
adjusting the 1990 census using DSE would have made the accuracy of state shares worse 
is quite strong— see the “Technical Notes” section of my 5 May 1998 written testimony. 
Based on that evidence, and my review of the details available for the proposed 2000 ICM,
I believe the 2000 census counts would be more accurate at the state level if DSE is not
used. Many serious problems with the 1990 DSE are present in the 2000 ICM, so the 
failure of the 1990 DSE is evidence that the proposed 2000 ICM would be less accurate than 
a simple census. 

4) Secretary Mosbacher, in testimony before both the House and the Senate, said that the 
Post Enumeration Survey would make the majority of the states more accurate. Is that 
statement correct? If so, why is his testimony so at odds with your testimony? 

I do not have a copy of Secretary Mosbacher’s testimony. I would be happy to read it and
reply in detail if you wish. I believe that using the 1990 Post-Enumeration Survey and 
Dual System Estimate would have made state shares less accurate. 

5) The 1990 census cost 20 percent more per household in real dollars than the 1980 census. 
The 1980 census cost twice as much per household in real dollars as the 1970 census. That is 
an increase in real dollar cost per household of 250 percent with no improvement in the 
differential undercount. Does that suggest to you that spending more on traditional 
methods will reduce the differential undercount? 

I think that there must be ways to motivate more of the population to respond to the 
census by mail. That would improve accuracy, and cut follow-up costs. Whether or not it
would decrease the differential undercount is an empirical question that I cannot answer
a priori. 

6) Demographic analysis showed higher undercounts of African Americans than the 
undercounts demonstrated by the Post Enumeration Survey. That suggests that the Post 
Enumeration Survey understates, not overstates, the undercount, especially for minorities. 
In other words, isn’t it likely that the 1990 census missed more African-Americans [than]
would have been added back into the census by the Post Enumeration Survey? 

I think the primary issue is shares, not totals. Shares can be worse if people are put in 
the wrong place than if no adjustment were made. For example, suppose there are only 
two states, A and B; only two ethnicities, pink and green; and no gender. Suppose the 
census finds:


State    pink           green        total
A        100            10           110 (55.6%)
B        80             8            88 (44.4%)
total    180 (90.9%)    18 (9.1%)    198


Suppose we know (from some perfect demographic analysis, perhaps) that nationwide, 3 
pink people (1.7 percent) and 1 green person (5.6 percent) are missing. Then the true 
population fraction of pink people is 90.6 percent, the true population fraction of green 
people is 9.4 percent, and the differential undercount rate is about 3.9 percent. The DSE 
says 2 pink people and 1 green person are missing, all from state A. It would appear that 
adjusting the counts is a good idea, because it makes the totals closer to the Demographic 
Analysis. The adjusted counts would be: 


State    pink           green        total
A        102            11           113 (56.2%)
B        80             8            88 (43.8%)
total    182 (90.5%)    19 (9.5%)    201


The percentages of pink and green people in the overall population in the adjusted census 
are closer to those in the demographic analysis. Suppose the DSE adjustment is mostly 
bias in the DSE. In fact, the 3 missing pink people are missing from state B, and the 1 
missing green person is missing from state A. Then the truth is: 


State    pink           green        total
A        100            11           111 (55.0%)
B        83             8            91 (45.0%)
total    183 (90.6%)    19 (9.4%)    202




Adjustment made state shares less accurate (they are off by 1.2 percent, while the census 
was off by only 0.6 percent), even though it made the totals more accurate.
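
The arithmetic in the two-state example can be checked with a short script. The sketch below is illustrative only: the dictionary layout and the function names (state_shares, share_error, total) are assumptions made for this check, and the counts are copied from the three tables in the example. With unrounded shares, the adjusted share of state A is off by about 1.3 percentage points rather than the 1.2 obtained by subtracting the rounded percentages; the census share is off by about 0.6 either way.

```python
# Counts are (pink, green) per state, copied from the three tables.
census = {"A": (100, 10), "B": (80, 8)}
adjusted = {"A": (102, 11), "B": (80, 8)}  # DSE adds 2 pink and 1 green, all to A
truth = {"A": (100, 11), "B": (83, 8)}     # 3 pink missing from B, 1 green from A


def total(counts):
    """Total population over all states and groups."""
    return sum(sum(groups) for groups in counts.values())


def state_shares(counts):
    """Each state's fraction of the total population."""
    n = total(counts)
    return {state: sum(groups) / n for state, groups in counts.items()}


def share_error(counts, reference, state="A"):
    """Absolute error in one state's share, relative to the reference counts."""
    return abs(state_shares(counts)[state] - state_shares(reference)[state])


census_err = share_error(census, truth)
adjusted_err = share_error(adjusted, truth)

print("totals:", total(census), total(adjusted), total(truth))
print(f"share error, census:   {100 * census_err:.1f} points")
print(f"share error, adjusted: {100 * adjusted_err:.1f} points")
```

The adjusted total (201) is closer to the true total (202) than the census total (198), yet the adjusted share for state A is further from the truth than the census share, which is the point of the example.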

The situation is the same for the 1990 DSE: most of the adjustment is bias, and it is 
implausible that the adjustment put the missing people more or less where they belonged. 
As a result, the adjusted state shares are probably less accurate than the census state 
shares. Even if the DSE added the right number of people nationally, it probably put 
them in the wrong places. The result is less accurate state shares. 

7) You have talked a lot about bias in the Post Enumeration Survey but have not talked 
much about the bias in the census. The differential undercount measured by demographic 
analysis shows that the bias in the census is quite real. If there is no Integrated Coverage
Measurement, is it not the case that this bias in the census will continue? 

The census does seem to be biased at the level of national totals, and is probably biased at 
the level of state shares. The ICM is unlikely to fix the bias in the census. It just adds 
different biases. 

8) Do you believe that it is acceptable for the census to consistently miss certain segments of 
the population - [African] Americans, Latinos, Asian Americans, poor people in rural and
urban communities - at greater rates than the White population? If that is not acceptable, 
what do you propose be done to reduce the differential undercount? Can you offer any 
evidence that [your] proposal(s) will reduce the differential undercount?

It is a regrettable fact that the census makes mistakes. It is a regrettable fact that DSE 
does not fix those mistakes— it just makes different mistakes. I wish the differential 
undercount could be eliminated, or at least reduced. The best way to decrease the 
differential undercount is to motivate undercounted groups to respond to the mail-out 
census questionnaires. 

9) It has been stated that one of the faults of the 1990 PES was correlation bias. Can you 
explain correlation bias? I understand that it is the likelihood that the people missed in the 
census may be the same people missed in the PES. Said another way, both the census and 
the survey miss the same people, for example, young Black males. How does correlation
bias affect the accuracy count of those traditionally undercounted, Blacks, Hispanics, 
Asians, Native Americans, renters? 

“Correlation bias” is a label for two kinds of failure of the hypotheses on which the DSE
is based: (i) being “caught” by the census can influence the chance of being “caught” by
the PES, and (ii) different individuals within a post-stratum have different chances of 
being caught either by the census or by the PES. The existence of people who are 
unreachable by both the census and the PES is a failure of the second kind. Correlation
bias does not affect the accuracy of the census; it is a source of error in DSE adjustments.
Some demographers say that such unreachable people are especially likely to be in dense 
inner cities, which often have large minority populations. Because such people are 
“caught” neither by the census nor by the PES, DSE adjustment does not take them into 
account. 
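
A small calculation may help show why unequal capture chances bias the DSE. The sketch below is not the Bureau's procedure: it assumes the textbook capture-recapture form of the dual-system estimate (census count times PES count, divided by the number of people found in both), it works with expected counts rather than simulated data, and the group sizes and capture probabilities are made up for illustration. The function name expected_dse is hypothetical.

```python
def expected_dse(groups):
    """Dual-system estimate computed from expected counts.

    groups: list of (size, p_census, p_pes) tuples for one post-stratum.
    Within each group, capture by the census and capture by the PES are
    assumed independent (an illustrative assumption).
    """
    n_census = sum(n * pc for n, pc, ps in groups)
    n_pes = sum(n * ps for n, pc, ps in groups)
    matched = sum(n * pc * ps for n, pc, ps in groups)
    return n_census * n_pes / matched


# Everyone has the same capture chances: the estimate recovers the true 1000.
homogeneous = [(1000, 0.90, 0.80)]

# Same true population of 1000, but 100 people are harder to catch in both.
heterogeneous = [(900, 0.95, 0.90), (100, 0.60, 0.50)]

# 50 of the 1000 people cannot be reached by either the census or the PES.
unreachable = [(950, 0.90, 0.80), (50, 0.0, 0.0)]

for label, groups in [("homogeneous", homogeneous),
                      ("heterogeneous", heterogeneous),
                      ("unreachable", unreachable)]:
    print(f"{label:14s} true = 1000, DSE = {expected_dse(groups):.1f}")
```

Under these assumptions the estimate falls short of the true population whenever capture chances differ within the post-stratum, and people unreachable by both efforts are left out of the estimate altogether.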

10) Wouldn’t the only risk of correlation bias be minimization of the undercount rather
than an overestimation of the undercount? 

No. If correlation bias is different in different places, that can reduce the accuracy of 
state shares estimated by the DSE. 

11) In testimony before the Senate Committee on Governmental Affairs approximately one 
year ago, Dr. Lawrence Brown, Professor of Statistics at the University of Pennsylvania, 
stated that, “Statistical sampling methods can be used in an effective and objective way to 
assist the census process.” Do you agree with Dr. Brown’s statement? If you disagree, please 
explain why. 

I agree. For example, I understand that sampling methods are used successfully by the 
Census Bureau for quality control of interviews. 

12) Dr. Lawrence Brown also testified before Senator Thompson that the Sampling for 
Nonresponse Follow-up plan “is an objective procedure all the way around [and] has a very
good chance of working as desired.” Do you agree with that statement? If you disagree,
please explain why. 

The plan appears to be objective (although it involves many ad hoc choices), but it seems 
unlikely to reduce the biases in the census. I believe that sampling for non-response 
follow-up will decrease data quality, and introduce a new source of error into DSE 
adjustments. However, I am more troubled by the sampling-based DSE adjustments than 
by sampling for non-response follow-up. 

13) In addition, Dr. Brown testified that the Census Bureau’s 2000 census plan had been
“drastically simplified and improved ... [these changes] make it possible to believe that
the Integrated Coverage Measurement might work as well as desired to correct the
undercount.” Do you agree with that statement? If you disagree, please explain why.

I agree that the current proposal for the 2000 ICM is simpler than some past proposals, 
and that the data analysis is simpler in some respects than the 1990 DSE. The statement 
you cite is hardly an endorsement of the planned 2000 ICM: it is possible to believe that 
the proposed ICM might reduce the undercount, but I am convinced that it will make state
shares less accurate. For the ICM to improve state shares would require an implausible
cancellation of large errors. Moreover, there will never be a way to tell whether such a
cancellation occurs. Therefore, it cannot be shown that the ICM improves the census.

14) With regard to concerns that the Integrated Coverage Measurement process could be 
manipulated to achieve a particular outcome in terms of the population counts, Dr. Brown
testified that, “if all of this planning is done in advance, it is very, very hard for me to see
how one could direct these subjective decisions towards any desired goal.” Do you agree
with Dr. Brown that if the procedures and protocols for the Integrated Coverage 
Measurement are set forth in advance and subject to expert and public scrutiny, that it is 
very unlikely that the sampling and statistical estimation process will be subject to 
manipulation, possibly for political advantage? If you disagree, please explain why. 

I have no opinion about this. 

15) Dr. Brown also testified that even after the non-response follow-up phase of the census is 
complete, there “would still [be] the undercount problem of those people who just refuse to be
counted or are very difficult to count.” Do you agree with that? If you disagree, please
explain why. 

I agree. 

16) With regard to the post-enumeration survey in the 1990 census, Dr. Brown testified that 
many of the difficulties with the procedure “can be traced to the fact that the PES sample 
was much too small to support the kind of objective, reliable analyses that are desired.” Do
you agree with that? If you disagree, please explain why. 

The sample size was inadequate, but there were many other serious problems with the 
analysis, such as the biases discussed in my 5 May 1998 testimony. Increasing the sample 
size would not decrease those biases. It would probably exacerbate them. 

17) The size of the sample in the Integrated Coverage Measurement (ICM) is 750,000 
households. Is that a proper size for such an endeavor? 

There is no proper sample size for the ICM, because the main problem is bias, not 
sampling error. 

18) The results of the PES in 1990 showed that census was less accurate than its predecessor. 
That result was confirmed by demographic analysis, which has been performed on every 
census since 1940. We certainly know that the 1990 census was much more expensive than 
the 1980 census. Do you agree with the conclusion that 1990 was also less accurate than 
1980? 




Because demographic analysis does not estimate state shares, it is not possible to tell 
from demographic analysis whether the 1990 census was less accurate than the 1980 
census at the level of states, or for state shares. Because of the uncertainties in 
demographic analysis, it is not clear whether the 1990 census was less accurate than the 
1980 census at the national level, but the evidence suggests that at the national level the 
1990 census was the second most accurate census, if not the most accurate census, in U.S. 
history. 

19) Please explain the difference between net over- or undercount in the 1990 census count 
and actual over- and undercounts (mistakes) made [in the] 1990 count. I know that a net 
undercount of 1.6% sounds relatively small but for census purposes, aren’t those 26 million
mistakes a concern? 

Net undercount is the number of people counted erroneously, minus the number of 
people who were not counted. Both of these terms are computed at the block level, not at 
the national level. That is, the same person, who really lives somewhere in the US, can 
contribute both an erroneous enumeration and a gross omission, if his or her address is 
incorrect in the census (the person will be a gross omission where the person really lives, 
and an erroneous enumeration at the incorrect address). The importance of the two 
errors depends on the geographic level one cares about: at the block level, both errors are 
important, but for such a person, the errors cancel at the national level. Overall, the 
gross omissions and erroneous enumerations in the census cancel to some degree, 
although not perfectly, when aggregated to states or the nation. The figure of 1.6% you 
cite appears to reflect some of the revisions in the PES since it was first published; I 
believe the figure of 26 million mistakes may not reflect those revisions. The large size of 
the revisions should make such estimates suspect. 
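
The distinction between gross errors and net error can be illustrated with a toy count. The sketch below is a made-up example, not census data: the five people, the two blocks X and Y, and the function name gross_errors are all hypothetical. It takes net undercount as gross omissions minus erroneous enumerations, so that a positive number means people were missed on balance; that sign convention is an assumption of the sketch.

```python
from collections import Counter

# Where each person really lives, and where the census recorded them.
true_block = {"p1": "X", "p2": "X", "p3": "Y", "p4": "Y", "p5": "Y"}
census_block = {"p1": "X", "p2": "Y", "p3": "Y", "p4": "Y"}
# p2 is recorded at the wrong block; p5 is missed entirely.


def gross_errors(true_block, census_block):
    """Count gross omissions and erroneous enumerations per block."""
    omissions = Counter()
    erroneous = Counter()
    for person, block in true_block.items():
        if census_block.get(person) != block:
            omissions[block] += 1   # not counted where the person really lives
    for person, block in census_block.items():
        if true_block.get(person) != block:
            erroneous[block] += 1   # counted where the person does not live
    return omissions, erroneous


omissions, erroneous = gross_errors(true_block, census_block)
gross = sum(omissions.values()) + sum(erroneous.values())
net_undercount = sum(omissions.values()) - sum(erroneous.values())

for block in ("X", "Y"):
    print(block, "omissions:", omissions[block], "erroneous:", erroneous[block])
print("gross errors:", gross, "net undercount:", net_undercount)
```

The misplaced person p2 produces two block-level errors that cancel in the national total, so three gross errors shrink to a net undercount of one; block Y even shows a net error of zero while containing one omission and one erroneous enumeration.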

20) I understand that improvement in the average does not necessarily mean that there will 
be improvement in every case. In 1990, there was criticism about the strata being broken 
down by region. If statistical methods are used in 2000, with strata broken down by state in 
2000, can we expect more states with improved accuracy than there were in 1990? 

No. First of all, bias is probably more important than the sampling error. The bias in 
1990 was so large that, in my opinion, the 1990 DSE was not trustworthy. I have not seen 
anything in the 2000 plan that would reduce the level of bias to the point that adjustment 
reasonably would be expected to improve census accuracy. Furthermore, even though the 
proposed sample size is larger, the number of post strata is also larger, so there is a 
tradeoff that might increase the sampling error too. 

21) Representative Sawyer pointed out that the longer the Census Bureau is in the field, the 
higher the error rate in the information collected. I believe that information came from one 
of the many GAO studies he and his Republican colleagues commissioned. You have stated
your concern about the Census Bureau not [being] in the field for enough days in the 2000
plan. Can you explain the difference in opinion? 

The quality of data will suffer if the Census Bureau tries to work so quickly that it uses 
poorly trained or less competent field workers, or allows too little time for it to be 
possible to do their work well. Data quality will also suffer if too much time goes by, 
because people move and memories fade. Therefore, I see no contradiction. 

22) In order to address the problem of declining public response, the GAO suggested 
exploring a radically streamlined questionnaire in future censuses. Would you give us your 
thoughts on how effective this approach might be in increasing response, and also its effect 
on perhaps diminishing the usefulness of census data? 

Everyday experience suggests that it is easier to get 5 minutes of someone's time than 2 
hours. Data from a shorter questionnaire could be less useful. 

23) In its 1992 capping report on the 1990 census, the GAO concluded that “the results and
experience of the 1990 census demonstrate that the American public has grown too diverse
and dynamic to be accurately counted solely by the [traditional] ‘headcount’ approach and
that fundamental changes must be implemented for a successful census in 2000.” Do you
agree with that conclusion? If you disagree, please explain why. 

I believe that a headcount is the most accurate method available. Perhaps someday 
someone will devise a better approach, but the 1990 experience indicates that the DSE is 
less accurate than a headcount. 

24) After the 1990 census, GAO concluded that "the amount of error in the census increases 
precipitously as time and effort are extended to count the last few percentages of the 
population ... This increase in the rate of error shows that extended reliance on field
follow-up activities represents a losing trade-off between augmenting the count and adding
more errors." In the last months of the follow-up efforts in 1990, the GAO estimated that the
error rates approached 30 percent, and that this problem was probably exacerbated by the 
use of close-out procedures. This appears to be a problem inherent to the methodology of the 
1990 census. Do you agree? 

Do you have any information on the error rates for information gathered using close-out 
procedures? 

Even if sampling is not perfect, isn’t its error rate well below the levels for the last 
percentages of the population using more traditional follow-up procedures? 




If this is the case, then doesn’t that logically lead to GAO’s and the Commerce Department's 
Inspector General’s conclusion that sampling at least a portion of the nonresponding 
households would increase the accuracy and decrease the cost of conducting the census? 

The problem in reaching the last few percent does not go away with sampling: one still
needs to reach the last few percent of the sample, or the same kinds of errors occur and
are magnified. The likely cost savings from a 90 percent sample with complete follow-up
in the sample, versus the 1990 approach to head counting, seem rather small, and
accuracy would probably suffer. If follow-up within the sample is incomplete, the 
resulting errors are just magnified by the sampling ratio. Only if follow-up within the 
sample is truncated could there be significant cost savings, but that would substantially 
reduce the accuracy for the hardest households to count, which are already the biggest 
problem. For both the census and the PES, the data quality is worst for the cases that are 
hardest to follow up, and a disproportionate part of the expense is in following up the 
hardest cases. Furthermore, sampling for non-response follow-up will make the DSE even 
more difficult, and even less accurate. Data quality problems in the PES follow-up are 
magnified enormously by the DSE. Thus the problem is worse for the DSE than for a 
headcount. 
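The magnification argument in this answer can be illustrated with a small sketch. It assumes a simple random sample with equal weights, so that each sampled household stands in for 1 / sampling_fraction households; the function name `scaled_error` and the figure of 100 errors are hypothetical and do not come from the testimony or from Census Bureau data.

```python
def scaled_error(errors_in_sample: float, sampling_fraction: float) -> float:
    """Errors in the full estimate implied by errors made inside the sample.

    Assumption: equal-probability sampling, so every sampled household
    carries a weight of 1 / sampling_fraction in the final estimate.
    """
    if not 0.0 < sampling_fraction <= 1.0:
        raise ValueError("sampling_fraction must be in (0, 1]")
    return errors_in_sample / sampling_fraction


# Hypothetical numbers, for illustration only.
for fraction in (1.0, 0.9, 0.5, 0.1):
    implied = scaled_error(100, fraction)
    print(f"sampling fraction {fraction:.1f}: "
          f"100 sample errors -> {implied:.0f} errors in the estimate")
```

Under these assumptions, complete follow-up of every nonresponding household (a fraction of 1.0) leaves the 100 errors unscaled, while a 1-in-10 sample turns the same 100 errors into roughly 1,000 in the estimate.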

25) GAO also concluded after the 1990 census that a high level of public cooperation is key 
to obtaining an accurate census at reasonable cost. Unfortunately the mail response rate 
has fallen with every census since 1970, and was only approximately 65 percent in 1990. The 
reasons for this decline are in many instances outside of the Census Bureau control, for 
example the increase in commercial mail and telephone solicitations and in nontraditional 
household arrangements. For these reasons, the Bureau is planning a public education 
campaign for the 2000 census, surpassing any previous attempts. Given the response in 1990, 
do you believe this is money well spent? 

Do you believe that this public education campaign can succeed in arresting the decline in 
response rate? 

Even if it does, wouldn’t some use of sampling be warranted to solve the problems associated
with reaching the last few [percent] of nonresponding households? 

I am not expert at motivating the public, but I think that such a campaign could be very 
helpful. The details of the campaign would be crucial to its success. See my answer to the 
previous question (24) in response to the last part of this one. 




Mr. Miller. And we’ll proceed now to Mr. Wade Henderson. 

Before you sit down, Mr. Henderson, raise your right hand.

[Witness sworn.] 

Mr. Miller. Thank you. 

Your official statement will be included in the record, of course, 
which we have received, thank you. We have plenty of time be- 
cause there is not going to be a vote for a little while yet. 

STATEMENT OF WADE HENDERSON, EXECUTIVE DIRECTOR, 
LEADERSHIP CONFERENCE ON CIVIL RIGHTS 

Mr. HENDERSON. Thank you, Mr. Chairman. Good afternoon, Mr. 
Chairman, members of the subcommittee. I’m Wade Henderson, 
the executive director of the Leadership Conference on Civil Rights. 
And on behalf of the Leadership Conference, I appreciate the
opportunity to appear before you today on what can only be
characterized as one of the highest priorities of the civil rights community,
that of ensuring a fair and accurate census count in the year 2000.

The subcommittee’s decision to hold a hearing revisiting the 1990 
census is a laudable one; and I am hopeful that in doing so, our 
Nation may move a step closer to ensuring that we do not repeat 
the same mistakes again. 

By way of background, the Leadership Conference on Civil 
Rights is the Nation’s oldest, largest, and most diverse coalition of 
organizations committed to the protection of civil and human rights 
in the United States. Today, the Leadership Conference has over 
180 national organizations representing virtually every aspect of 
the American polity and working together in a bipartisan fashion
to resolve the pressing civil rights problems of the day. 

We have established, of course, throughout this hearing, the con- 
stitutional basis for the census count, and I think we certainly 
agree that the census is at the core of our democratic system of
Government. As such, the census has a profound impact on the life
of every resident in the country. And while the primary reason for 
the collection of census data is the apportionment of representation 
in Congress, census data also provides the statistical basis for Gov- 
ernment planners, policy advocates, and private industry to shape 
future domestic policy. Now I agree with former Census Bureau
Director Barbara Bryant, who observed that the census is about
moving power and money. It is one of the most profound innovations
of democratic government.

Because the accuracy of the census directly affects our Nation’s 
ability to ensure equal representation and equal access to impor- 
tant governmental resources for all persons under our Constitution, 
ensuring a fair and accurate census must be regarded as one of the 
most significant civil rights issues facing our country today. By my 
view, the census count for the year 2000 is the “sleeper” civil rights 
issue of the 105th Congress. 

Now the 1990 census, as has been established here today, was 
both the most expensive and least accurate census in modern 
times. It certainly marked the first time in five decades that a cen- 
sus was less accurate than its predecessor. And on the basis of de- 
mographic analysis, as has been mentioned, the undercount was 
approximately 4.7 million people. 





As an aside, Mr. Chairman, I saw a chart earlier today that 
pointed out that the demographic analysis revealed, in actuality, a 
5.7 percent undercount for the black population, not the 4.4 percent 
that’s been indicated at least on one chart. 

In addition, the 1990 undercount of racial and ethnic minority 
groups, referred to as the differential undercount, was the highest 
ever recorded since the Census Bureau began conducting post-cen- 
sus evaluations in 1940, missing 4.5 percent of the African-Amer- 
ican population, 5 percent of persons of Hispanic origin, 2.3 percent 
of Asians and Pacific Islanders, and over 12 percent of Native 
Americans living on reservations. 

Most disturbing, however, is how badly the 1990 census under- 
counted children. While children under the age of 18 represented 
26 percent of the total national population that year, they
accounted for an incredible 52 percent of the undercount. But the
undercount of these populations is only a part of the problem of the 
1990 census. The real problem of the 1990 census was in the total 
undercount. The number of individuals missed, and those individ- 
uals who were double-counted was about 10 million people. That is 
the equivalent of disregarding the entire population of the State of 
Ohio, or the State of Michigan, or most of Illinois. Moreover, the 
people missed did not live in the same communities as the people 
who were counted twice. The mistakes did not, in other words,
cancel each other out. Ultimately, the 1990 enumeration cost $2.6
billion, an amount double that of the 1970 census, and 25 percent
greater than the 1980 census in inflation-adjusted dollars. The
logical question, therefore, is how did such a comprehensive effort 
result in the first count known to be less accurate than its prede- 
cessor, even after spending an unprecedented amount of money? 
The answer is simply that traditional census methods were unable 
to manage the increased mobility and looser family structure of 
contemporary Americans and new immigrants. 

In 1990 the Census Bureau sent about 100 million question- 
naires to housing units. The Bureau received a mail response rate 
of approximately 65 percent, down from the 75 percent received in 
1980, and 78 percent received in 1970. The Bureau then attempted 
to physically count the remaining 35 percent of the population, or 
over 34 million cases through the use of followup census enumera- 
tors. The census enumerators had the task of visiting every non- 
responding residence in an attempt to count the Nation’s true pop- 
ulation. 

A 1992 General Accounting Office report to Congress stated, and 
I quote, “The results and experiences of the 1990 census dem- 
onstrate that the American public has grown too diverse and dy- 
namic to be accurately counted solely by the traditional head-count 
approach, and that fundamental changes must be implemented for 
a successful census in 2000.” 

The issue before us, therefore, becomes how best to uphold the 
spirit of the constitutional requirement when traditional methods 
are not adequate to make an accurate count. 

Some individuals have suggested that failing to count 1.6 percent 
of the population is not particularly problematic, and that some in- 
accuracy in the census count should be expected. But whether it’s 
elderly citizens in Sarasota, people of color in New York City, the
rural poor in central Illinois, the urban poor in Chicago,
immigrants in Fairfax County and Prince William County, Native
Americans and Latinos in Phoenix and Scottsdale, or poor children in
Kansas City, each congressional district is adversely affected when
the census misses that many people.

Now it’s really not necessary to accept the Leadership Conference 
analysis of the failures of the 1990 census count to persuade you. 
Instead, listen to just a few words of some of your colleagues and 
their reaction to the failures of the 1990 census count. 

In an April 30, 1991, letter, Speaker of the House Newt Gingrich, 
quote, “strongly urged” Robert Mosbacher, then the U.S. Commerce 
Secretary, to adjust Georgia’s population by a figure of about 
300,000. That 300,000 figure was calculated by the Census Bureau 
through a form of sampling conducted to determine how many peo- 
ple the traditional head count actually missed. Mr. Gingrich went 
on to add, and I quote, “Needless to say, if the undercount is not 
corrected, it would have a serious negative impact on Georgia.” 

In an August 19, 1994, letter to President Clinton, 32 Members 
of the congressional “Sunbelt Caucus” including Republican Rep- 
resentatives and Senators from Virginia, Florida, North Carolina, 
South Carolina, Mississippi, Louisiana, and New Mexico, called on 
the President to, quote, “Let stand a recent decision by the Second 
Circuit Court of Appeals to overturn a lower court ruling that let 
the census figures remain unadjusted.” The Members added, and 
I quote, “A failure to win adjustment of the census has meant a 
continuing hardship for sunbelt States and regional officials. One 
must ask, therefore, what is the intent and purpose of Federal 
funding that is -- that has a population component other than to
assist State and local governments in serving their actual number of
residents? This is strictly a fairness issue.” And, we agree. 

It was precisely because the 1990 census was such a miserably 
failed census, that in 1991, Congress asked the National Academy 
of Sciences to study the viability of redesigning the Census Bu- 
reau’s methods for the 2000 census. The overarching goals set by 
Congress were to constrain costs and improve accuracy, with a par- 
ticular focus on reducing the differential undercount. 

The Census Bureau has worked hard over the past several years 
to research, test, and evaluate census methods to achieve these ob- 
jectives. It has been guided by recommendations from independent 
experts, including three panels of the National Academy of 
Sciences, the General Accounting Office, and the Commerce De- 
partment’s Office of Inspector General. 

The resulting plan for 2000 combines a more aggressive enu- 
meration effort. It doesn’t abandon it; it combines a more aggres- 
sive enumeration effort, including sending replacement question- 
naires to non-responding households, using paid advertising, de- 
signing an easier-to-understand form, and making forms available 
in public places, with modern scientific sampling techniques to 
complete the count of the final non-responding households and to 
eliminate the pervasive undercount of children, people of color, and 
the urban and rural poor. 

Mr. Chairman, I thought it was necessary to state what the 2000 
census and the Census Bureau propose to do, because from some 
of the testimony this afternoon, one may get the impression that
only sampling is being used. Sampling, of course, is being used to
complement what is going to be the most aggressive enumeration 
effort ever undertaken by the Census Bureau. The scientific sam- 
pling methods would not substitute for an aggressive method to 
count everyone directly. Instead, as a complement to an aggressive 
enumeration effort, scientific sampling would help the Bureau ac- 
count for all residents, even those who historically have been the 
hardest to reach through traditional counting methods. 

As the statisticians testifying before me noted, the sampling 
methods used in 1990 were not perfect. The outcome was not as re- 
liable or precise as we would have hoped. I am confident, however, 
that if the decision had been made to use the adjusted population 
numbers for the reapportionment of Congress and other purposes, 
the figures would have been scrutinized more thoroughly and the 
errors, perhaps, would have been caught in time. 

Are there uncertainties associated with the Census Bureau’s plan 
for 2000? Of course there are, and anyone who says otherwise 
would be mistaken. However, just because there are uncertainties, 
does that mean that we should abandon a process that Congress, 
itself, designed to provide the best and most accurate count
possible, and that we should not make an effort to improve and
refine techniques?

The Census Bureau’s plan to use 

Mr. Miller. Mr. Henderson, excuse me. 

Mr. Henderson. Yes, sir. 

Mr. Miller. We have some votes going on 

Mr. Henderson. OK. 

Mr. Miller [continuing]. And what we’re going to do is recess 
and take the votes and then those that can come back — Congress- 
woman Maloney may not be able to come back. Unfortunately, it 
will be — we have five votes in a row. The first one is right now, 
and then there will be 5-minute votes. So I’m saying, I would guess 
right now, 40 minutes. I apologize for that. Can you be here when 
we come back? 

Mr. Henderson. Mr. Chairman, I’d hate to lose the opportunity 
to complete my testimony before you. However, I stand at your 
pleasure, sir, so if you are prepared to come back, I’m certainly
prepared to be here for you.

Mr. Miller. Thank you. 

Mrs. Maloney. Or as an alternative, we could have Mr. Hender- 
son lead off at our next hearing when all 

Mr. Miller. Well, we’re always going to have 

Mrs. Maloney [continuing]. The Members would be there. 

Mr. Miller [continuing]. This tight time constraint, so if you’re 
willing to come back — I’ll be back as soon as we get the last vote. 
Well, at least both of us, I hope you all can join us. Mr. Davis said 
he’ll be able to come back, too. 

Mr. Henderson. OK. 

Mr. Miller. Hopefully, Mr. Shadegg, and maybe we can get a 
couple of your members back. 

I apologize for the delay 

Mr. Henderson. It’s all right, Mr. Chairman. 

Mr. Miller [continuing]. I thought the vote was going to be a lit- 
tle later than that. 





Mr. Henderson. OK. 

Mr. Miller. Thank you, Mr. Henderson. 

Mr. Henderson. Well, thank you. 

[Recess.] 

Mr. Miller. We’ll begin — return to the witness. 

Mr. Henderson, as I said a few minutes ago, we apologize for the 
delay. These things go a little longer than we thought, and I didn’t 
realize how many votes. We thought we could complete it. But I ap- 
preciate your staying and look forward to the rest of your com- 
ments and then some discussion. Thank you very much. 

Mr. Henderson. Well, Mr. Chairman, thank you very much. 
Thank you for your courtesy this afternoon. And I do appreciate, 
indeed, your willingness and the other members of the committee 
to return. Obviously, it’s been a long day for you all, in particular, 
but your willingness to come back and complete this hearing is 
much appreciated. 

Let me also say that the importance of this hearing to those of 
us in the civil rights community really can’t be overstated. My will- 
ingness to stay today was not merely to accommodate your courtesy 
in inviting me, but also to emphasize the importance of this issue — 
a fair and accurate census count to the Nation, as a whole, and to 
emphasize the importance of this issue as a genuine civil rights 
concern that’s often overlooked. And as I stated earlier in my testi- 
mony, the use of census data has such a profound impact on the 
country, as a whole, and is not genuinely appreciated by many in 
our Nation. And so my presence here today is designed to empha- 
size it. 

I will conclude my remarks by simply reminding the committee 
that the Census Bureau’s plan to address the disproportionate 
undercount is, by all accounts, the most cost-effective proposal 
under consideration. Cost, however, is not the real issue, because 
no matter how much money we throw behind outdated methodol- 
ogy, most experts agree, we will not eliminate the disproportionate 
undercount utilizing the same methods as were used in 1990. The 
deterioration in the accuracy of the census between 1980 and 1990 
cannot be attributed to inadequate funding by Congress. This is 
simply not a situation where allocating more resources will solve 
the problem, per se. 

Mr. Chairman, the original text of the Constitution indeed sanc- 
tioned a differential undercount in the census by including only 
three-fifths of the enslaved population in the enumeration. Even 
with the removal of this offensive language through the adoption 
of the 14th amendment to the Constitution, the census continues 
to miss a disproportionate number of people of color, persons living 
in rural and urban areas, particularly the poor, and children. And 
we believe that under the proposed plan for the year 2000, the con- 
stitutional mandate to count every person in our country, at least 
that spirit will be addressed by that proposal. 

Now we recognize that the census will never produce a perfect 
result; but our Nation should not accept an effort that reaches no 
further than those who are the easiest to count or those who want 
to be counted. Preventing the Census Bureau from continuing to 
develop and to explain their plans to improve upon their past
efforts to provide the most accurate census possible, we believe is
not in the national interest. And with that, I have concluded my
formal presentation. 

Thank you, Mr. Chairman. 

[The prepared statement of Mr. Henderson follows:] 





Mr. Chairman and members of the Subcommittee, I am Wade Henderson, Executive 
Director of the Leadership Conference on Civil Rights (LCCR). On behalf of the Leadership
Conference, I appreciate the opportunity to appear before you today on what can only be 
characterized as one of the highest priorities of the civil rights community, that of ensuring a fair 
and accurate census count in the year 2000. The Subcommittee’s decision to hold a hearing 
revisiting the 1990 census is a laudable one; and I am hopeful that in doing so, our nation may 
move a step closer to ensuring that we do not repeat the same mistakes again.

By way of background, the Leadership Conference on Civil Rights is the nation’s oldest, 
largest and most diverse coalition of organizations committed to the protection of civil and human 
rights in the United States. The Leadership Conference was created by A. Philip Randolph, 
Arnold Aronson, and Roy Wilkins in 1950 as an independent body to promote passage and the
implementation of civil rights laws designed to achieve equality under law for all persons in the 
United States. 1 Today the LCCR has over 180 organizations that work in a bipartisan fashion to 
resolve the pressing civil rights problems of the day. These organizations include groups 
representing persons of color, women, labor organizations, persons with disabilities, older 
Americans, gays and lesbians, major religious groups, and civil liberties and human rights 
interests. 


1 A. Philip Randolph was the Founder and President of the Brotherhood of Sleeping Car 
Porters; Arnold Aronson was Program Director of the National Jewish Community Relations 
Advisory Council, a coalition of major Jewish organizations; and Roy Wilkins was acting 
Executive Secretary of the NAACP. 





Article I, Section 2, Clause 3 of the United States Constitution 2 places the census at the
core of our democratic system of governance. As such, the census has a profound impact on the
life of every resident of this country. While the primary reason for the collection of census data is
the apportionment of representation in Congress, census data also provide the statistical basis for
government planners, policy advocates and private industry to shape future domestic policy. The
data are also then used to apportion electoral college votes to each state; to carry out
congressional, state, and local redistricting; and to monitor and enforce compliance with civil 
rights statutes, including the Voting Rights Act of 1965, and employment, housing, lending, and 
education anti-discrimination laws. Census results also serve as the basis for the annual 
distribution of billions of dollars in federal and state funds. As former Census Bureau Director 
Barbara Bryant observed, the census is about “moving power and money... [It is] one of the most 
profound innovations of democratic government.” 3 

Because the accuracy of the census directly affects our nation’s ability to ensure equal 
representation and equal access to important governmental resources for all Americans, ensuring 
a fair and accurate census must be regarded as one of the most significant civil rights issues facing 
the country today. This was confirmed just two weeks ago at the Leadership Conference’s 
Annual National Board Meeting, when the National Board reaffirmed that ensuring a fair and 
accurate census count through the limited use of statistical sampling will remain among the 
Leadership Conference’s highest legislative priorities. 

2 The Constitution of the United States requires the Congress to conduct an “actual 
enumeration” of the “whole number of persons within each state” every ten years. 

3 Society: Population, Politics, and Race, at 20.




The 1990 census was both the most expensive and least accurate census in modern times.
It marked the first time in five decades that a census was less accurate than its predecessor. On
the basis of “demographic analysis,” 4 the undercount was 4.7 million people; the undercount rate
of 1.8 percent in 1990 was 50 percent greater than the rate had been in 1980. 5 In addition, the
1990 undercount of racial and ethnic minority groups, referred to as the “differential
undercount,” was the highest ever recorded since the Census Bureau began conducting post-census
evaluations in 1940, missing 4.5 percent of African Americans; 5 percent of Americans of Hispanic
origin; 2.3 percent of Asians and Pacific Islanders; and over 12 percent of Native Americans
living on reservations. Most disturbing is how badly the 1990 census undercounted children. While
children under the age of 18 represented 26 percent of the total national population that year,
they accounted for an incredible 52 percent of the undercount. 6 But the undercount of these
populations is only part of the problem of the 1990 census.

[Charts: net undercount and differential undercount, by decennial census]

4 Demographic analysis is one of the two standard methods that the Census Bureau uses to
measure coverage, that is, the extent that the official census totals cover or completely account for
the true total. Demographic analysis is the only method for analyzing historical trends in the
shortfall in coverage, the national undercount. In “Report to Congress -- The Plan for Census
2000,” Bureau of the Census, United States Department of Commerce, July 1997, Revised
August 1997.

5 Ibid., p. 2.
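The two shares quoted for children in the statement (26 percent of the population, 52 percent of the undercount) imply a relative undercount rate, which the following sketch works out. It assumes that both shares are measured against the same population base; the function name `relative_rate` is hypothetical, and the results are arithmetic implications of the rounded shares, not Census Bureau estimates.

```python
def relative_rate(share_of_undercount: float, share_of_population: float) -> float:
    """Group undercount rate divided by the overall undercount rate.

    If a group holds share p of the population and share u of the
    undercount, its rate relative to the overall rate is u / p.
    """
    if not 0.0 < share_of_population <= 1.0:
        raise ValueError("share_of_population must be in (0, 1]")
    return share_of_undercount / share_of_population


# Shares as stated in the testimony; everyone else is the complement.
children = relative_rate(0.52, 0.26)
adults = relative_rate(1 - 0.52, 1 - 0.26)

print(f"children: {children:.2f} times the overall rate")
print(f"adults:   {adults:.2f} times the overall rate")
print(f"children relative to adults: {children / adults:.2f}")
```

On these rounded inputs, children were missed at about twice the overall rate and about three times the rate for adults.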

The real problem of the 1990 census was that the total undercount -- the number of
individuals missed and those individuals who were double-counted -- was about 10 million people,
according to evaluations by the General Accounting Office. 7 That is the equivalent of disregarding
the entire population of the State of Ohio, or the State of Michigan, or most of Illinois. 

Moreover, the people missed did not live in the same communities as the people who were
counted twice; the mistakes did not cancel each other out.

Ultimately, the 1990 enumeration cost $2.6 billion -- an amount double that of the 1970
census and 25 percent greater than the 1980 census -- in inflation-adjusted dollars. 8 The logical
question is how did such a comprehensive effort result in the first count known to be less accurate 
than its predecessor, even after spending an unprecedented amount of money? 


6 Ibid., p. 3.

7 “Capping Report,” U.S. General Accounting Office, Washington, D.C., June 1992.

8 See Decennial Census: Fundamental Design Decisions Merit Congressional Attention,
U.S. General Accounting Office, Washington, D.C. (GAO/T-GGD-96-37, October 25, 1995),
p. 4.




The answer is simply that traditional census methods were unable to manage the increased 
mobility and looser family structure of contemporary Americans and new immigrants. In 1990, the 
Census Bureau sent about 100 million questionnaires to housing units. The Census Bureau 
received a mail response rate of 65 percent, down from 75 percent in 1980, and 78 percent in 
1970. 9 The Bureau then attempted to physically count the remaining 35 percent of the population,
or over 34 million cases, through the use of follow-up census enumerators. These census 
enumerators had the task of visiting every non-responding residence in an attempt to count the 
nation’s true population. A 1992 General Accounting Office report to Congress stated, “the 
results and experiences of the 1990 census demonstrate that the American public has grown too 
diverse and dynamic to be accurately counted solely by the ‘traditional’ headcount approach and 
that fundamental changes must be implemented for a successful census in 2000.” 10

9 Ibid., p. 60.

10 Ibid., p. 2.

Some individuals have suggested that failing to count 1.6 percent of the population is not
particularly problematic and that some inaccuracy in the census count should be expected. To
those who are willing to settle for similar results in 2000, one may ask, how will we explain this to
persons who are among the undercounted? What will we say to the elderly who rely on census
data for funding of senior citizen centers and various health programs? What will we say to
persons with disabilities who count on accurate census numbers for assisted housing programs or
to battered women who rely on these figures for Violence Against Women formula grants? What
will we say to Native Americans who rely on accurate census data for employment and training
programs at the Department of Labor? What will we say to the poor children who rely on accurate
census data to fund Head Start and the school lunch program, or to the rural poor who rely on 
federal funds for rural electrification loans? And what will we say to Latinos, African Americans, 
and Asian Pacific Americans who were disproportionately undercounted in 1990, and who will be 
again, if the Census Bureau uses the same methods that were used in 1990, methods that we 
know will produce an unacceptable differential undercount? 

Whether it be elderly citizens in Sarasota, people of color in New York City, the rural
poor in Central Illinois, the urban poor in Chicago, immigrants in Fairfax and Prince William
Counties, Native Americans and Latinos in Phoenix and Scottsdale, or poor children in Kansas
City, each of your districts is adversely affected when the census misses this many people.


It is not necessary to accept my analysis on the failures of the 1990 census. Instead, listen 
to how some of your colleagues reacted to the failures of the 1990 census:

• In an April 30, 1991, letter, Speaker of the House Newt Gingrich “strongly urge[d]”
Robert Mosbacher, then the U.S. Commerce Secretary, to adjust Georgia’s population 
figure by about 300,000. That 300,000 figure was calculated by the Census Bureau 
through a form of sampling conducted to determine how many people the traditional 
headcount missed. Mr. Gingrich went on to add, “Needless to say, if the undercount is not 
corrected, it would have a serious negative impact on Georgia.” 11 

• In an August 19, 1994, letter to President Clinton, 32 members of the Congressional
“Sunbelt Caucus” -- including Republican Representatives and Senators from Virginia,
Florida, North Carolina, South Carolina, Mississippi, Louisiana, and New Mexico --
called on the President to “let stand a recent decision by the Second Circuit Court to
overturn a lower court ruling that let the Census figures remain unadjusted.” The members
added, “Failure to win adjustment of the census has meant a continuing hardship for
Sunbelt state and regional officials. In each year of the decade, the decision affects every
child sitting in a classroom, every person driving on a highway, every person filing for
unemployment, every state or local government applying for revenue bonds, every elderly
person needing health care, every local government working to clean its air, and every
police force fighting crime. Each day, our region’s state and local governments struggle to
serve their actual number of residents, while they receive funds based on inaccurate
population counts in the Official Census count ... One must ask: what is the intent and
purpose of federal funding that has a population component other than to assist state and
local governments in serving their actual number of residents? This is strictly a fairness
issue.” 12

11 The Honorable Newt Gingrich, April 30, 1991, Letter to United States Secretary of
Commerce Robert Mosbacher.


Mr. Chairman and members of the Committee, these are the words of your own colleagues.
Moreover, just two weeks ago, in a forum sponsored by the LCCR on Census 2000, Matthew
Glavin, President and CEO of the Southeastern Legal Foundation, which is sponsoring one of the
two lawsuits seeking to bar statistical sampling, said, “The 1990 Census was a miserably failed
census.” 13


12 Congressional Sunbelt Caucus, August 19, 1994, Letter to United States President
William Jefferson Clinton.

13 Matthew Glavin, President and CEO, Southeastern Legal Foundation, at the LCCR Civil
Rights Conference, April 20, 1998.


It was precisely because the 1990 census was such a “miserably failed census” that in 1991 
Congress asked the National Academy of Sciences to study the viability of redesigning the Census 
Bureau’s methods for the 2000 census. The overarching goals set by Congress were to constrain 
costs and improve accuracy, with a particular focus on reducing the differential undercount. 14 

The Census Bureau has worked hard over the past several years to research, test, and 
evaluate census methods to achieve these objectives. It has been guided by recommendations 
from independent experts, including three panels of the National Academy of Sciences, the 
General Accounting Office, and the Commerce Department’s Office of Inspector General. 

The resulting plan for 2000 combines a more aggressive enumeration effort — including 
sending replacement questionnaires to non-responding households, using paid advertising, 
designing an easier-to-understand form, and making forms available in public places - with 
modern scientific sampling techniques to complete the count of the final non-responding 
households and to eliminate the pervasive undercount of children, people of color and the urban 
and rural poor. 

The scientific sampling methods would not substitute for an aggressive effort to count 
everyone directly. Instead, as a complement to an aggressive enumeration effort, scientific 
sampling would help the Bureau account for all residents, even those who historically have been 
hardest to reach through traditional counting methods. 

14 Decennial Census Improvement Act of 1991, Public Law 102-125. 


As the statisticians testifying before me noted, the sampling methods used in 1990 were 
not perfect. The outcome was not as reliable or precise as we would have hoped. I am confident, 
however, that if the decision had been made to use the adjusted population numbers for the 
reapportionment of Congress and other purposes, the figures would have been scrutinized more 
thoroughly and the error would have been caught in time. Are there uncertainties associated with 
the Census Bureau’s plan for 2000? Of course there are; and anyone who says otherwise would 
be mistaken. 

However, just because there are uncertainties, does that mean we should abandon a 
process that Congress designed to provide the best count; that we do not make an effort to 
improve and refine the techniques? If we approached every scientific endeavor with such an 
attitude, there would be no cure for polio, no vaccination against smallpox. We would be sitting 
in this room reading by candlelight. There is always a risk in deviating from the way things have 
been done in the past. However, the Census Bureau did not develop its sampling methods 
overnight. Its plan is the product of many decades of research and testing and evaluation. The 
Census Bureau is itself one of the world’s premier scientific agencies, and it has been guided by 
the nation’s leading statistical experts. 

The Census Bureau’s plan to use limited statistical sampling has been endorsed by a broad 
group of professional associations and organizations including the American Statistical 
Association, the National Association of Business Economists, the Council of Professional 
Associations on Federal Statistics, the Association of Public Data Users, not to mention a broad 
range of stakeholders like the National League of Cities, the U.S. Conference of Mayors, the 
National Association of Counties, the Cuban American National Council, Inc., the National Asian 
Pacific American Legal Consortium, the National Council of La Raza, the Mexican American 
Legal Defense and Educational Fund, the American-Arab Anti-Discrimination League and the 
National Association for the Advancement of Colored People. Each of these associations and 
organizations endorses the Census Bureau’s plan to address the undercount by an aggressive 
counting effort combined with limited statistical sampling methods. 

The Census Bureau’s plan to address the disproportionate undercount is by all accounts 
the most cost-effective proposal under consideration. Cost, however, is not the real issue because 
no matter how much money we throw behind out-dated counting methods, all experts agree, we 
will not eliminate the disproportionate undercount utilizing the same methods as were used in 
1990. The deterioration in the accuracy of the census between the 1980 and 1990 counts cannot 
be attributed to inadequate funding by Congress. This is simply not a situation where allocating 
more money solves the problem. 

Mr. Chairman, the original text of the Constitution sanctioned a ‘differential undercount’ 
in the census by including only three-fifths of the enslaved population in the enumeration. Even 
with the removal of this offensive language through adoption of the Fourteenth Amendment, the 
census continues to miss disproportionate numbers of people of color, the rural and urban poor 
and children. Under the Census Bureau’s plan for 2000, no person under the Constitution has to 
be invisible. 


The census will never produce a perfect result; but our nation should not accept an effort 
that reaches no further than those who are easiest to count or who want to be counted. To those 
opposed to the Census Bureau’s plan, one question must be posed, “How many more decades 
must the nation wait before trying a new method?” Preventing the Census Bureau from 
continuing to develop and explain their plans to improve upon their past efforts to provide the 
most accurate census possible would not serve this nation well. 


Mr. Miller. Thank you, Mr. Henderson. 

What I’d like to do now is call on Mr. Davis first, because he has 
to leave, then Mrs. Maloney and I’ll go after that. 

Mr. Davis. 

Mr. Davis of Virginia. Mr. Henderson, thank you for being here 
today, and I think your perspective is a welcome one, one which we 
value. I think we have the same goals in mind, a little bit different 
perspective. I’ve got a conclusion I’m trying to get to, and I want 
to give you time to answer the conclusion. So I want to ask a few — 
“yes or no” — that I think are pretty easy getting there, and then 
give you time to amplify when I ask you the question at the end. 

You’d agree with me that traditionally the civil rights movement 
has been about eliminating barriers to the participation of people 
in society and Government? 

Mr. Henderson. Absolutely. 

Mr. Davis of Virginia. And, that some examples of these types 
of barriers have been such odious practices as Jim Crow laws, poll 
taxes, literacy tests, denying one person one vote, preventing mi- 
norities from registering to vote, and invalidating the votes of minori- 
ties once they were cast? 

Mr. Henderson. Yes, sir, I would agree with that. 

Mr. Davis of Virginia. Yes. In fact, your testimony points out 
that perhaps the most egregious example of all of this was the 
counting of African-American slaves as less than a whole person? 

Mr. Henderson. Indeed. 

Mr. Davis of Virginia. It very accurately describes that. And 
would you agree that this was particularly awful that, in many 
cases, this past discrimination and violation of civil rights was ac- 
tually perpetrated by our own Government at the Federal, State, 
and local levels? 

Mr. Henderson. Absolutely. 

Mr. Davis of Virginia. And you’d agree that it was a major step 
forward for both our society and the civil rights movement when 
these types of odious barriers were removed? 

Mr. Henderson. Indeed. 

Mr. Davis of Virginia. And Government needs to continue to 
move forward to remove those barriers that deny people the chance 
to participate, like the Bureau’s 2000 census plan that purports to 
create the most accurate address list possible — printing the forms 
in 32 different languages, using a paid advertising program and 
promotion and outreach targeted toward hard-to-count populations, 
hiring census takers directly from the neighborhoods they need to 
count? You would agree that at least these steps are good steps to 
take to ensure that everybody has a chance to participate? 

Mr. Henderson. I think those steps are important steps. I would 
agree, although I am not prepared to say that that alone will 
produce 

Mr. Davis of Virginia. I think you’ve made it clear it’s not alone. 
You have other ways and — I agree with that. And as the executive 
director of the Leadership Conference on Civil Rights, it would be 
correct to say that you’d be strongly opposed to any efforts by Gov- 
ernment to go the wrong way and put barriers back up and keep 
individuals from participating in Government? 



Mr. Henderson. Certainly I would oppose the creation of new 
barriers although, Mr. Davis, I think one of the questions that ulti- 
mately will be raised is whether the Census Bureau in attempting 
to develop a methodology for the year 2000 that produces the fair- 
est and most accurate result, whether using all of the techniques 
that have been proposed will produce the result of creating new 
barriers. I have heard, for example, that the Census Bureau does 
propose to examine the returns of some individuals who have com- 
pleted their file and they may, in fact, compress them in some sort 
of statistical methodology. 

Mr. Davis of Virginia. Well, let me get to that. Let me get to that 
more directly and give you ample time to respond. 

As we’ve heard in earlier testimony, the Bureau’s plan to use the 
sampling in the 2000 census will involve subtracting real people 
from the census counts on the basis that statistical theory says 
that they really aren’t there, even though they have actual physical 
proof that they are. These are not duplicate forms that we’re talk- 
ing about. So in effect, they will serve to count some Americans as 
less than a whole person — the practice your testimony condemns. 
The organization — I know your organization has endorsed sampling 
and called it the civil rights issue of the 1990’s, and perhaps it is. 
But subtracting real people from the counts amounts to nothing 
more than a Government-sponsored civil rights violation, in my 
judgment, of millions of Americans who took the time to fill out 
their census forms. These Americans will be deleted, in some cases, 
from all different types of racial and ethnic groups that your orga- 
nization represents, in some cases. 

And we’re going to be introducing legislation to forbid the Census 
Bureau from removing valid, completed census forms from the 
counts through the use of statistical inference. 

And I think what we’d like to ask is; could we count on your sup- 
port for that aspect? 

Mr. Henderson. Certainly, Mr. Davis, I think it is fair to ask 
the Census Bureau to explain in totality the proposed methodology 
that it will use for the upcoming census count. And it seems to me 
that in asking the Bureau to both present and, perhaps, even to 
revalidate by demonstrating the scientific validity of what they pro- 
pose is not unreasonable. On the other hand, it seems to me there 
is a distinction between subtracting forms that may have been com- 
pleted and submitted to the Census Bureau from a comparison of 
where actual individuals have been somehow barred from being 
considered as part of the total population. 

I’d — the distinction that I would make is this; I have a knowl- 
edge, certainly, of what the Bureau has proposed as the totality of 
the methodology it will use. On the face of it, it seems to be fair, 
valid and, I believe, will produce a more accurate result than was 
the case certainly with 1990. It does not mean that in every aspect 
of what they’ve proposed that they are, you know, without a prob- 
lem. But I do think that to start from the premise that somehow 
what they are proposing to do in trying to balance out their meth- 
odology may, in fact, delete real individuals from the census count 
is perhaps a little bit of a distortion because I think that the meth- 
odology they are proposing to use meets the test that we would em- 
ploy for scientific validity. And I think it meets the test that was 
proposed by the National Academy of Sciences and the General Ac- 
counting Office. 

Mr. Davis of Virginia. But wouldn’t you agree that only duplica- 
tion should take on double counts, not — sampling shouldn’t do this. 
In other words, if somebody takes the time to fill out a completed 
census form, the bill that we’re going to introduce will allow that 
it could be deleted only when an actual duplicate or fraudulent 
questionnaire is found. You talked about complementing an aggres- 
sive enumeration, and I think 

Mr. Henderson. Absolutely. 

Mr. Davis of Virginia [continuing]. We should look at com- 
plementing an aggressive enumeration, but deleting actual people 
who have filled out the forms when there’s no duplication or fraud 
involved isn’t a complement; that’s an insult. 

Mr. Henderson. Well, it seems to me, Mr. Davis, and I under- 
stand the rationale behind your bill. And I certainly believe that, 
you know, we have and you have every right to question the pro- 
posed methodology that the Census Bureau would use. On the 
other hand, it seems to me that if they are making an effort to de- 
velop a comprehensive methodology, one which is based on individ- 
ual enumeration of the largest number of people that can be done 
complemented with scientifically and a valid methodology to cer- 
tainly estimate using principles that we all believe to be valid. The 
total population 

Mr. Davis of Virginia. But the problem is not everybody is going 
to believe they’re 

Mr. Henderson. Well, and I think that’s certainly — no, and I 
think that’s a reasonable question, and I think you have every 
right to ask them to come in and to reestablish the basis of their 
assumptions. But I would not, you know, be at this point prepared 
to embrace the bill that you’ve identified because it only goes to a 
small portion of what the Census Bureau proposes to do. And I un- 
derstand the rationale behind it, and certainly we are not for creat- 
ing new barriers to opportunities for anyone in our society. But I 
think in the initial instance, looking at the total proposal would be 
fair to both, you know, the Census Bureau and to the 

Mr. Davis of Virginia. Well, let me know — just sum up if I 
can 

Mr. Henderson. Sure. 

Mr. Davis of Virginia. I hope you’re not saying that, because the 
overall good by sampling is helpful, that maybe one or two or five 
people who filled out the form, statistical sampling would take 
them out of it, that that somehow justifies that the end 

Mr. Henderson. No, I’m not suggesting that. What I am sug- 
gesting is that the methodology which is proposed for the year 2000 
does need to be examined. It does need to meet the tests of sci- 
entific validity and sufficiency. I don’t think that’s unreasonable. At 
the same time, I would not be prepared without looking more com- 
prehensively at what the Bureau proposes to do, to sign on to any 
bill that would seek to inhibit one or a limited aspect of what the 
census can do to carry out their task even though I understand the 
method 

Mr. Davis of Virginia. Well, we’d hope to involve you 

Mr. Henderson. Absolutely. 



Mr. Davis of Virginia [continuing]. In this and send you copies 
of this, and it just seems to me any person who fills out that form, 
and you can’t show fraud or duplication, ought to be counted. It 
shouldn’t be discounted because some sampling methodology or 
some social scientist thinks that they somehow don’t fit the meth- 
odology that they have gotten. 

Mr. Henderson. We’re certainly prepared 

Mr. Davis of Virginia. And I would think you’d be 

Mr. Henderson [continuing]. To take a look at that and 

Mr. Davis of Virginia [continuing]. Particularly sensitive to that 
and we’ll continue to correspond on that. 

Mr. Henderson. Of course. 

Mr. Davis of Virginia. Thank you very much. 

Mr. Henderson. Thank you. 

Mr. Miller. Thank you. Mrs. Maloney. 

Mrs. Maloney. I’d like to really thank Mr. Henderson for your 
testimony, and particularly for staying. 

Mr. Henderson. Oh, thank you. 

Mrs. Maloney. And being here to answer our questions. I know 
it’s a huge contribution of your time. 

What is your opinion of whether or not the 1990 census was suc- 
cessful? 

Mr. Henderson. I think the 1990 census was an incredibly failed 
census, by any objective standard. I mean, I think if the purpose 
of the census is to produce the fairest and most accurate count of 
all persons here in the United States, who reside here, then I think 
the increased undercount between the 1980 — I’m sorry, between 
the 1970 and 1980, I’m sorry — the 1980 and the 1990 census, it 
seems to me is a real problem. And I think, by any objective stand- 
ard, one has to believe that the 1990 census was a failed census. 

Mrs. Maloney. Do you believe that the use of promotion and out- 
reach programs such as checks at homeless centers and soup kitch- 
ens, targeted advertising, forms in multiple languages, all the new 
ideas that they propose to use in the 2000 census can significantly 
reduce the differential undercount without the use of statistical 
sampling? 

Mr. Henderson. I certainly think that those techniques to 
spread the outreach efforts of the Census Bureau are positive. Hav- 
ing said that, however, I don’t think they will be sufficient unto 
themselves to reduce the differential undercount. And I think, 
again, by most objective standards at least as I have seen it, and 
that includes the evaluations that were done over the past 8 years 
by the General Accounting Office and the National Academy of 
Sciences, it would seem to indicate that that is the case. 

Mrs. Maloney. Our country is committed to equal rights, with- 
out regard to race or ethnicity, yet we know that huge undercounts 
exist. Does the civil rights community see the census undercount 
as an equal rights issue? 

Mr. Henderson. Oh, we certainly see it as an equal rights issue. 
I mean I think if you examine the populations of persons who are 
most often left out of a census count, they include discrete and 
insular populations, people of color — African-Americans, persons of 
Hispanic origin. They include the poor, whether in rural commu- 
nities or in urban centers, and they include children. And I think 
in each of those instances, the importance of ensuring the adequate 
representation of all of these groups, and really for that matter, all 
persons who reside in our country is really the prime directive that 
I hope, you know, that both Congress and the community at large 
will embrace. 

Mrs. Maloney. Well, can you explain or elaborate on how the 
undercount affects these groups, both in funding and political rep- 
resentation? And can you address how urban centers and minori- 
ties are adversely affected by the differential undercount? 

Mr. Henderson. Oh, I think there are many examples, Mrs. 
Maloney, that indeed make that case. I mean, as has been said 
here earlier through other witnesses, the census data is used for 
so many purposes. Obviously, reapportionment is certainly one of 
them, carrying out the responsibilities of civil rights statutes such 
as the Voting Rights Act or others, and also formula-driven alloca- 
tions of Federal resources to States have tremendous implication 
for all of the populations we’ve identified. It seems to me when we 
exclude wholesale segments of our population and the failure to 
provide an accurate count of all persons who reside, we deny the 
communities in which they live, the resources that they are enti- 
tled to, to address needs and services that Congress and the Amer- 
ican people certainly have every right to expect, and, moreover, you 
deny these individuals their rightful representation in the political 
process. 

Now, admittedly, the failure to reach these populations rests 
both with the outreach effort that’s undertaken by the census and 
obviously there is a responsibility within the communities affected 
themselves to do more to ensure that there is an adequate partici- 
pation. But the truth is, that without some additional effort, which 
I think is adequately reflected in the proposal to augment the enu- 
meration with sampling, I think we’re going to continue to have 
these gaps in our population count and will continue to have dire 
consequences for all of these groups and for the Nation as a whole. 

Mrs. Maloney. You mentioned in your testimony how badly the 
1990 census undercounted children, and I would like you to elabo- 
rate on that point. Can you explain why this happened and what 
we can do about this problem? 

Mr. Henderson. Well I think that, again as was noted, there is 
great difficulty in encouraging all segments of our society to take 
the same approach to the importance of the census. We are not al- 
ways able to convey in ways that overcome the skepticism, and in 
some instances, hostility that people have about Government docu- 
menting where they reside and how this information will be used. 
Children, unfortunately, are not in the position to, of course, take 
on that responsibility themselves. They generally rely on the adults 
in the households where they reside to ensure that they are accu- 
rately counted in whatever census enumeration occurs. And you 
can’t overcome that with the kinds of techniques that the Census 
Bureau has proposed even with expanded outreach. And the failure 
to count children as a population really does have a dire impact on 
the country, as a whole, and I think we’ve documented that in the 
kinds of programs that benefit children. But it seems to me that 
without something more than has been done and, again, consistent 
with the recommendations of the National Academy of Sciences 
and others, these proposals — a failure to adopt them will really cer- 
tainly produce an outcome which we know will be flawed. 

Seems to me that the question that Congress is wrestling with 
is not between a perfect system on the one hand and a speculative 
system on the other. The question is you have two flawed proposals 
in a sense. I mean one is we know a flawed proposal; that was the 
proposal that produced the substantial differential undercount in 
1990. You now have a set of recommendations that were achieved 
through the best available scientific methodology that Congress 
had at its disposal. And throughout the process, we have a consist- 
ent — almost a consensus, I think, at least within the scientific com- 
munity, on the importance of using sampling as one technique to 
augment enumeration. And failure to take advantage of that, it 
seems to me, would produce a significant shortcoming in the out- 
come that we’re trying to accomplish. 

Mrs. Maloney. Well, my time is up. Thank you very much. 

Mr. Henderson. Thank you. 

Mr. Miller. Thank you. Mr. Henderson, let me start by saying 
that we all should agree — and I think everybody here — that we 
want to have the most accurate census and minimize the 
undercount. 

Mr. Henderson. Absolutely. 

Mr. Miller. There is no question about the goal that everybody 
should be looking for. But what we don’t want to do is have a failed 
census, because a failed census threatens our democratic system of 
Government. Because, as you say, the census is the basis for all of 
most elected officials in America. City councils, school board dis- 
tricts are all adjusted by the census. 

I know you refer to the 1990 census as a failed census. I, respect- 
fully, don’t disagree with you. What was a failure in 1990 was the 
attempted use of sampling. And I think most people will acknowl- 
edge sampling was a failure in 1990. They did a full enumeration, 
and then they did this what was called a PES sample of 167,000 
households. Based on that, they wanted to adjust the census, and 
they were going to take a congressional seat from Pennsylvania 
and one from Wisconsin. This was back in 1991, and Secretary 
Mosbacher refused to do that. It turned out the following year, they 
found it was a computer mistake, and they should never have 
made that recommendation. 

The Census Bureau has also stated that the data is less accurate 
when you get down to 100,000 or less population. So, basically, in- 
formation you work with on census tracts and census blocks, and 
certainly for smaller communities, is less accurate. These are the 
Census Department’s own — of their own analysis. 

So sampling was a failure in 1990, and what scares us is to to- 
tally rely on sampling without any fallback is, to me, irresponsible. 
At least in 1990, we did have the census, the full enumeration, be- 
cause right now what they’re talking about doing is no full enu- 
meration. They’re only going to count 90 percent, and then do a 
sample of 750,000. They’re doing a five times larger sample, but in 
half the time. 

Mr. Henderson. Sure. 

Mr. Miller. Which is hard to say that they can achieve it. And 
when you mentioned a speculative system, this is very speculative 
because the one chance we’ve had to use sampling was a failure in 
1990. 

With respect to the undercount issue, we all need to address it. 
I don’t know if you know it as a fact that the percentage of blacks 
counted in the 1990 census was better than 1980. The percentage — 
and this is Census Department numbers — the percentage of blacks 
counted in 1990 was better than 1980 and 1970. The 1990 census 
was the second best in the history, better than 1970 and better 
than 1960; 1980 was a better census, though, when we have stat- 
isticians talking about it, they’re questioning the degree of the 
undercount. But we know there’s an undercount and we need to do 
everything we can to correct the undercount. 

Let me ask the question now. [Laughter.] 

And I’m not a lawyer; I don’t know if you are or not. But, at any 
rate, the question is — and this committee is not going to spend a 
lot of time with that issue — is by the constitutionality and the le- 
gality of sampling. And you are all familiar with that issue. I think 
you had this person that was involved in the issue at your panel 
that day. 

Mr. Henderson. Yes. 

Mr. Miller. Just assume we don’t want to talk about the legal 
issue. 

Mr. Henderson. Sure. 

Mr. Miller. If the Supreme Court said sampling can’t be used, 
what we have to do is do the very best census we can. We have 
to do everything we can to minimize that undercount and put 
whatever resources we need to in going after the undercount. We 
know part of the problem, 50 percent of it, is the address list. We 
know the children issue. 

Mr. Henderson. Sure. 

Mr. Miller. I don’t know if they use it in the WIC program? 
There’s a lot of programs we can use to get on it. Do you have any 
comments? Running ads in Time magazine may not be the answer, 
but there are some ideas out there. I mean, because you really, 
even if you’d sample, you need to get the best percentage in com- 
pletion as you can. 

Mr. Henderson. You do. Let me say, Mr. Chairman, that I think 
the constitutionality of sampling has at least been implicitly ad- 
dressed in some of the litigation in lower courts that has come be- 
tween the 1990 census and today. And I think the courts have rec- 
ognized that Congress had the authority to delegate to the Sec- 
retary of the Commerce the ability to employ both a post enumera- 
tion survey and sampling, if he chose to do so. And even though 
there was a challenge to Secretary Mosbacher’s authority to adjust 
the 1990 census, it was not addressed on the basis of constitu- 
tionality. I think the courts have spoken to that issue pretty au- 
thoritatively, and I think it is unlikely that they will rule that sam- 
pling is not constitutional. But let’s put that aside for a minute. 

I do think there is a question of what happens in the event the 
courts were to rule in that direction, and what is it that we do? It 
seems to me that we do precisely what we are going to do in 1990 
even with the addition of sampling, which is to say that the Census 
Bureau, in conjunction with as many national groups, stakeholders, 
those that have an interest in producing a fair and accurate count 
which is literally every entity that we’ve identified, you make the 
best effort one can to ensure that you get a full enumeration to the 
extent possible. But we recognize, even with those best efforts, 
there will be a tremendous gap between what we are able to ac- 
complish with our best efforts and the total population that needs 
to be counted. And the question becomes whether there will be 
methodologies available to the Census Bureau to address that un- 
known factor. And as I said, I mean you know, I think you begin 
with the premise that the 1990 census was a flawed census but you 
attribute that failure to the use of sampling. I think there were a 
combination of factors, and perhaps the inappropriate application 
of some aspects of sampling may have been among them. But I 
think a couple of things are true here. 

First, in the wake of that admitted debacle that we all agree was 
just, you know, a problem, Congress did authorize a process to try 
to bring the best and brightest to the table to analyze prospectively 
what could be done. And that result produced, seems to me, a set 
of recommendations that were adopted by the Academy of Sciences, 
but also by others who have examined these issues closely and who 
have no political “ax to grind,” in terms of how this issue is re- 
solved. And I’m not suggesting, by the way, that any member of the 
committee has that ax. I do think, however, that when scientific or- 
ganizations are asked to examine the methodology that they might 
employ, one can assume that they are at least not looking at it in 
quite the same political vein that, you know, Members of Congress 
and others who are directly affected by this issue might. 

The recommendations that were made were then examined care- 
fully, by both the Census Bureau and by Congress. The General 
Accounting Office examined these issues. The Inspector General at 
the Department of Commerce and every professional association 
having some involvement in the use of statistics or demographic 
data examine this. And they came to the same conclusion; that an 
enumeration augmented by a sampling approach was the best and 
soundest use and most effective use of resources available to us. 
And I think even if Congress were prepared to invest substantial 
resources above and beyond what has already been allocated, most, 
you know, fair and objective observers of this system would suggest 
that that’s not going to be adequate. So, I think if the courts were 
to rule that sampling was not constitutional, they, at least for a 
time, would be consigning us to a flawed and inaccurate count. And 
that certainly would be the case in the year 2000 and, perhaps, be- 
yond as well. 

Mr. Miller. Well, if we go with full enumeration we’ll have to 
at least work together to try to make sure 

Mr. Henderson. Absolutely. 

Mr. Miller [continuing]. That the undercount is corrected, be- 
cause we all want to work toward achieving the minimum, if no 
undercount whatsoever, and get the best census we can. 

Mrs. Maloney. Can I ask him one followup question? 

Mr. Miller. Sure. 

Mrs. Maloney. You were saying, Mr. Henderson, that no matter 
how much money was spent on more enumerators or even more 
promotion and outreach, that it would not improve the accuracy of 
the census count — is that what you’re saying? 



Mr. Henderson. Well, no; I think it will have some impact on 
improving the overall count. Certainly I think that by investing 
more resources and enumerators and public education and out- 
reach efforts, it is bound to have some positive effect. However, it 
will not be sufficient unto itself to deal with a differential 
undercount which we know existed in 1990. 

Mrs. Maloney. No matter how much you spend? 

Mr. Henderson. No matter how much you spend. 

Mrs. Maloney. I’d like to submit questions to the record, if I 
could, for panel two. And I really would like to end, if I could, very 
briefly with a question that the chairman and I were talking about 
when we walked down to vote. And I asked him what the next 
hearing would be on, and he said he really didn’t know, but it 
might be on how we would reduce the differential undercount. And 
you’ve touched on it, but I’d just like to be very clear on it. Other 
than using statistical methods and that which we know is in the 
plan for the 2000 census, can you think of any way to reduce the 
differential undercount? 

Mr. Henderson. Mrs. Maloney, I have really wrestled with this 
issue for quite a while. I am not aware of other approaches that are 
likely to bear greater fruit in this effort than what has already 
been proposed by the Census Bureau. And obviously, we are 
searching collectively for any and all techniques and methodology 
that would augment the actual number of persons counted so that 
the need for, you know, scientific sampling and other techniques 
would not be as great as it is today. I just can’t think of any other 
approach. And I certainly think that the recommendations that 
have been made by the National Academy of Sciences and others, 
until they have been proven to be or shown to be really ill-con- 
ceived, I think are the best evidence that we have available of what 
can be used effectively to increase the accuracy and fairness of the 
count. 

Mrs. Maloney. Thank you very much. 

Mr. Henderson. Thank you. 

Mr. Miller. As we conclude, the one concern we have, and as 
other witnesses were saying earlier, is that in sampling we’re 
trading one type of error for another type of error, and some 
statisticians will say we have a less accurate census. We don’t want 
a less accurate census. 

Mr. Henderson. Of course not. 

Mr. Miller. We want to get the most accurate that we can and 
minimize the undercount. So we have a common goal. 

Let me thank you, again, for being here today. I’m sure we’ll be 
having an ongoing discussion on this issue for the next couple of 
years. [Laughter.] I ask unanimous consent for the record to remain 
open for 2 weeks for Members to submit questions for the 
record and that witnesses submit written answers as soon as prac- 
tical. Without objection, so ordered. 

That was my housekeeping duty. Thank you very much for being 
here today and the meeting will stand adjourned. 

Mr. Henderson. Mr. Chairman, thank you. 

Mrs. Maloney. Thank you, Mr. Miller. 



Mr. Miller. Thank you. 

Mrs. Maloney. Thank you, Mr. Henderson. 

Mr. Henderson. Thank you. 

[Whereupon, at 6:53 p.m., the subcommittee adjourned subject to 
the call of the Chair.] 




